ABSTRACT

We use a semi-analytic model of galaxy formation in hierarchical clustering theories to interpret recent 
data on galaxy formation and evolution, focussing primarily on the recently discovered population of 
Lyman-break galaxies at z ~ 3. For a variety of cold dark matter (CDM) cosmologies we construct 
mock galaxy catalogues subject to identical selection criteria to those applied to the real data. We find 
that the expected number of Lyman-break galaxies is very sensitive to the assumed stellar initial mass 
function and to the normalization of the primordial power spectrum. For reasonable choices of these and 
other model parameters, it is possible to reproduce the observed abundance of Lyman-break galaxies 
in CDM models with $\Omega_0 = 1$ and with $\Omega_0 < 1$. The characteristic masses, circular velocities and
star formation rates of the model Lyman-break galaxies depend somewhat on the values of the cosmological
parameters but are broadly in agreement with available data. These galaxies generally form from rare 
peaks at high redshift and, as a result, their spatial distribution is strongly biased, with a typical bias 
parameter, $b \sim 4$, and a comoving correlation length, $r_0 \sim 4\,h^{-1}$ Mpc. The typical sizes of these galaxies,
$\sim 0.5\,h^{-1}$ kpc, are substantially smaller than those of present day bright galaxies. In combination with
data at lower redshifts, the Lyman-break galaxies can be used to trace the cosmic star formation history. 
We compare theoretical predictions for this history with a compilation of recent data. The observational 
data match the theoretical predictions reasonably well, both for the distribution of star formation rates 
at various redshifts and for the integrated star formation rate as a function of redshift. Most galaxies 
(in our models and in the data) never experience star formation rates in excess of a few solar masses 
per year. Our models predict that even at z = 5, the integrated star formation rate is similar to that 
measured locally, although less than 1% of all the stars have formed prior to this redshift. The weak 
dependence of the predicted star formation histories on cosmological parameters allows us to propose 
a fairly general interpretation of the significance of the Lyman-break galaxies as the first galaxy-sized 
objects that experience significant amounts of star formation. These galaxies mark the onset of the epoch 
of galaxy formation that continues into the present day. The basic ingredients of a consistent picture of 
galaxy formation may well be now in place. 

Subject headings: galaxies: evolution - galaxies: formation - galaxies: fundamental parameters



1. INTRODUCTION 

Observational studies of galaxy formation and evolution 
have progressed at a breathtaking pace over the past cou- 
ple of years. Data from the refurbished Hubble Space 
Telescope, the Keck and other large telescopes are now 
providing quantitative information on essential properties 
of the galaxy population - number densities, luminosities, 
colours, morphologies and star formation rates - over a 
large span of cosmic time. These data are beginning to 
sketch out an empirical picture of galaxy formation and 
evolution from redshift z ~ 4 to the present. 

Evolution has now been established and quantified in: 
(i) the neutral hydrogen and metal content of the universe 
since z ~ 4 (Lanzetta, Wolfe & Turnshek 1995, Storrie- 
Lombardi et al. 1996, Wolfe et al. 1995, Lu et al. 1996); 
(ii) the galaxy luminosity function since z ~ 1 (Lilly et al.
1995; Ellis et al. 1996); (iii) the morphology of field and
cluster galaxies since z ~ 0.8 (e.g. Abraham et al. 1996,
Dressler et al. 1994, Small et al. 1997). The most recent
addition to this remarkable list of observational advances is
the discovery of a large population of actively star-forming
galaxies at z ~ 3, identified by their redshifted Lyman
continuum breaks (Steidel et al. 1996a, hereafter S96). It is
this population that we are primarily concerned with in 
this paper. 

One of the earliest windows on the physical processes 
at play in galaxy formation is provided by studies of pri- 
mordial gas clouds detected in absorption against back- 
ground quasars. The comoving density of neutral hydro- 
gen present in damped Lyman-alpha clouds peaks at z ~ 3 
when it was comparable to the mass in baryons seen in 
galactic disks today (Storrie-Lombardi et al. 1996). The 
decline in the abundance of neutral hydrogen clouds seems 
to be accompanied by a gradual build-up of their metal 
content (Lu et al. 1996). A population of bright galaxies 
is certainly well established by z = 1 (Lilly et al. 1995, Ellis
et al. 1996, Kauffmann, Charlot & White 1996). In the
CFRS survey of Lilly et al., evolution is manifest in a
systematic variation of the shape of the luminosity function
of blue galaxies and a brightening of their characteristic 
luminosity with increasing lookback time. The luminosity 
function of red galaxies, on the other hand, seems to have 
changed little over this redshift interval, although the frac- 
tion of galaxies that have the colours of passively evolving 
ellipticals appears to fall to one third of its present day
value by z = 1 (Kauffmann et al. 1996). 

On the whole, galaxies appear to be smaller and in- 
creasingly irregular at higher redshift (Driver et al. 1995, 
Glazebrook et al. 1995a, Abraham et al. 1996, Small et 
al. 1996, Odewahn et al. 1996, Pascarelle et al. 1996, 
Lowenthal et al. 1997). For example, the class of
"irregular/merger" galaxies, which are relatively rare at bright
magnitudes, makes up about a third to a half of all galaxies
with $I_{AB} \simeq 25$. The median redshift at this apparent
magnitude (the faintest at which automated morphological
classification is possible on high resolution HST images) is
z ~ 0.8. Similarly, the fraction of spirals in rich clusters
at z ~ 0.5 is higher than in present day clusters (Dressler
et al. 1994). All these studies leave little doubt that the
galaxy population has evolved significantly since $z \simeq 1$.

The discovery of a large population of Lyman-break
galaxies at z > 3 provides the first opportunity for
statistical studies of evolutionary processes in galaxies
beyond $z \simeq 1$. Steidel and collaborators (Steidel &
Hamilton 1992, 1993; Steidel, Pettini & Hamilton 1995; S96)
searched for high redshift galaxies by selecting objects in a
colour-colour plane constructed from images in customised
$U_n$, $G$ and $\mathcal{R}$ band filters. For galaxies in the redshift
range 3.0 < z < 3.5, the Lyman limit discontinuity passes
through the $U_n$ filter. Opacity due to intervening neutral
hydrogen increases the strength of the Lyman discontinu- 
ity regardless of the shape of the intrinsic spectral energy 
distribution of the galaxy (Madau 1995). Thus, a galaxy 
in this redshift range will be faint in the $U_n$ band (thus
becoming a "UV-dropout") and so will have a very red
$U_n - G$ colour, whilst possibly having a blue $G - \mathcal{R}$ colour
if it is undergoing significant star formation. A similar
strategy has been successfully implemented in the Hubble
Deep Field (HDF) by Steidel et al. (1996b) and Madau
et al. (1996). The HST U filter has a shorter median
wavelength than the $U_n$ filter of the ground-based
observations, and so colour selected objects in the HDF span a
wider range of redshifts, $2 \lesssim z \lesssim 4.5$.

Follow-up spectroscopy of the UV drop-out candidates 
by S96 on the Keck telescope confirmed that these galaxies
lie mostly in the expected redshift range, 3.0 < z < 3.5.
Their spectra resemble those of nearby starburst galaxies.
From the apparent $\mathcal{R}$-band magnitude, a dust-free
model for the spectral energy distribution in the UV,
and an assumption about the initial stellar mass function
(IMF), S96 inferred star formation rates in these galaxies
in the range $1 - 6\,h^{-2}\,M_\odot\,\mathrm{yr}^{-1}$ for a critical density
universe, where we have expressed Hubble's constant as
$H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$. Similarly low star formation
rates have been inferred by Lowenthal et al. (1997) for
11 galaxies in the HDF at z = 2 - 4.5. From the width
of saturated interstellar absorption lines, S96 inferred
tentative one-dimensional velocity dispersions in the range
$\sigma_{1D} = 180 - 320$ km s$^{-1}$. They concluded that the
Lyman-break galaxies they discovered could be the progenitors of
the spheroidal components of present day galaxies. 

The current state of empirical knowledge on galaxy 
formation has been nicely summarized by Madau et al. 
(1996) and Madau (1996) in the form of a "cosmic star 
formation history." Combining a variety of surveys
(including the CFRS and the Lyman-break galaxy surveys),
they derived metal production and star formation rates as
a function of redshift, from z = 0 to z ~ 5. Observed
star formation rates over this redshift range are typically 
a few solar masses per year for individual galaxies. The 
integrated star formation rate never differs by more than 
an order of magnitude over this entire redshift range,
although a peak of activity seems to have occurred at z ~ 1.
The total amount of metals produced by the observed pop- 
ulations is comparable to the amount of metals seen in 
massive galaxies today, suggesting that the bulk of the 
cosmic star formation has now been identified. 

In this paper we employ the semi-analytic model devel- 
oped in a series of earlier papers (Cole et al. 1994; Heyl 
et al. 1995; Baugh et al. 1996a, 1996b), to investigate the 
significance of the Lyman-break galaxy population within 
the context of hierarchical clustering theories of galaxy 
formation. We consider the circumstances under which 
such a population may form and we focus on the connec- 
tion between these high redshift objects and galaxies seen 
in various evolutionary stages at lower redshifts. We use 
the available data to test in detail our earlier theoreti- 
cal predictions for the way in which galaxies are built up 
from small fluctuations in a universe dominated by cold 
dark matter (CDM). Specifically, we test the prediction of 
White & Frenk (1991), Lacey et al. (1993) and Cole et 
al. (1994) that the bulk of the stars present in galaxies 
today formed relatively recently, with a median redshift of 
star formation of only z ~ 1. The results of these compar- 
isons are very encouraging and suggest a general picture 
of galaxy formation and evolution which is consistent with 
the expectations from a broad class of hierarchical cluster- 
ing cosmologies. 

A more basic attempt to investigate whether the abun- 
dance of S96 Lyman-break galaxies is consistent with cold 
dark matter models was recently carried out by Mo & 
Fukugita (1996). They used the Press & Schechter (1974) 
formalism to calculate the number density of halos with 
velocity dispersion in excess of $\sigma_{1D} = 180$ km s$^{-1}$, making
simple assumptions about the time required for a galaxy 
to form in each halo. They concluded that a range of 
low-density COBE-normalized CDM models are compati- 
ble with the data. 
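
To make the Press-Schechter argument concrete, the short sketch below evaluates the fraction of mass in collapsed halos above a given mass scale as a function of redshift. It is only an illustration under stated assumptions: an $\Omega_0 = 1$ universe (so the linear growth factor is $1/(1+z)$), the standard spherical-collapse threshold of 1.686, and an illustrative present-day rms fluctuation `sigma_m = 1.0` that is not taken from this paper or from Mo & Fukugita (1996).

```python
import math

# Linear-theory collapse threshold for spherical collapse in an
# Omega_0 = 1 universe (standard value; an assumption of this sketch).
DELTA_C = 1.686


def growth_factor_eds(z):
    """Linear growth factor for Omega_0 = 1, normalised to 1 at z = 0."""
    return 1.0 / (1.0 + z)


def collapsed_fraction(sigma_m, z):
    """Press-Schechter fraction of mass in halos above the mass scale
    whose present-day linear rms fluctuation is sigma_m."""
    nu = DELTA_C / (sigma_m * growth_factor_eds(z))
    return math.erfc(nu / math.sqrt(2.0))


# Illustrative mass scale with sigma_m = 1.0 today (hypothetical value).
for z in (0.0, 1.0, 3.0):
    print(z, collapsed_fraction(1.0, z))
```

The steep decline of the collapsed fraction with redshift is the reason why halos of a fixed mass at z ~ 3 correspond to rare peaks of the density field.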

Our semi-analytic galaxy formation scheme is briefly
reviewed in Section 2, where we discuss our procedure for
generating mock catalogues of high redshift galaxies. The
colour selection criteria of S96 are reviewed in Section 3
and the abundance of galaxies that meet these constraints
in a variety of cosmological models is given in Section 3.2.
The expected properties of high redshift galaxies - masses,
star formation rates, sizes, clustering, etc. - are presented
in Section 3.3. Our models predict the entire evolutionary
history of galaxies and so, in Section 4, we illustrate
the eventual fate of a few high redshift examples and
examine the statistical properties of the descendants of the
Lyman-break objects. In Section 5, we recast the Cole et
al. (1994) predictions for the cosmic star formation history
in a manner that is directly comparable to the Madau et
al. (1996) data, and we also compare our predicted
evolution in the neutral gas fraction with observations. Finally,
we present our conclusions in Section 6.

2. SEMI-ANALYTIC MODELLING OF GALAXY FORMATION

2.1. General description of the model 

Semi-analytic modelling is a novel technique for
calculating ab initio the evolutionary properties of galaxies in
cosmological models in which structure forms hierarchi- 
cally. The growth of dark matter halos by accretion and 
mergers is followed statistically, while physically motivated 
rules are used to describe the cooling of gas in these halos, 
the transformation of cold gas into stars, and the effects 
of feedback from massive stars on the dynamics of the gas. 
The spectrophotometric properties of the stars that form 
are calculated from a spectral synthesis model. The ba- 
sic physical concepts, mathematical formalisms, and first 
applications of semi-analytic modelling are presented in 
White & Rees (1978), Cole (1991), Lacey & Silk (1991), 
White & Frenk (1991), Kauffmann, White & Guiderdoni
(1993), and Cole et al. (1994). The technique has now
been successfully applied to a variety of problems in galaxy 
formation (Lacey et al. 1993; Kauffmann, Guiderdoni & 
White 1994; Heyl et al. 1995; Kauffmann 1995, 1996a,b;
Baugh et al. 1996a,b; Frenk et al. 1996, 1997).

The basic rules that govern the physical processes in 
the galaxy formation scheme adopted in this paper are 
presented in detail in Cole et al. (1994). In brief, when a 
dark matter halo collapses, its associated gas is assumed 
to be shock-heated and to settle into quasistatic equilib- 
rium at the virial temperature of the halo. This gas cools 
radiatively over the lifetime of the halo and cold gas is 
turned into stars at a rate proportional to the instanta- 
neous cold gas mass. Feedback from supernovae and stel- 
lar winds returns some of the cold gas to the hot phase, 
strongly inhibiting star formation in low circular velocity 
halos. During a merger, the dark matter halos coalesce, 
but the galaxies within them can survive longer, eventually 
merging on a timescale related to the dynamical friction 
time. 
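
The star formation and feedback rules described above can be illustrated with a deliberately simplified sketch. It assumes a closed box with no further cooling or recycling: stars form at a rate M_cold / tau, and feedback reheats cold gas at beta times that rate, so the cold gas decays exponentially on a timescale tau / (1 + beta). The power-law dependence of `beta` on circular velocity, and the default values `v_hot = 140.0` and `alpha = 5.5`, are placeholders chosen for illustration rather than values stated in this text.

```python
import math


def feedback_beta(v_circ, v_hot=140.0, alpha=5.5):
    """Hypothetical feedback efficiency: mass of gas reheated per unit
    mass of stars formed. Strong in low circular velocity halos."""
    return (v_circ / v_hot) ** (-alpha)


def evolve_cold_gas(m_cold0, tau_gyr, beta, t_gyr):
    """Closed-box solution of
        dM_star/dt = M_cold / tau
        dM_hot/dt  = beta * M_cold / tau
    starting from m_cold0 of cold gas and no stars."""
    rate = (1.0 + beta) / tau_gyr
    m_cold = m_cold0 * math.exp(-rate * t_gyr)
    m_star = (m_cold0 - m_cold) / (1.0 + beta)
    m_reheated = beta * m_star
    return m_cold, m_star, m_reheated


# Compare a low and a high circular velocity halo (km/s) after 1.5 Gyr.
for v_circ in (70.0, 280.0):
    beta = feedback_beta(v_circ)
    print(v_circ, evolve_cold_gas(1.0, 1.5, beta, 1.5))
```

In this toy picture the halo with the lower circular velocity converts a much smaller fraction of its cold gas into stars, which is the qualitative behaviour the feedback rule is meant to produce.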

For this analysis, we have upgraded the Cole et al.
(1994) model in various ways. The main modification is
the replacement of the "block model" (Cole & Kaiser 1988) 
as the description of the merger history of dark matter ha- 
los. In the new scheme we use a Monte Carlo method based 
on the analytical expression for the halo progenitor mass 
function derived from the "extended Press-Schechter the- 
ory" (Bond et al. 1991; Bower 1991; see also Lacey & Cole 
1993) to generate binary merger trees. Each tree describes 
a possible merger history for a halo of specified final mass. 
At each branch in the tree a halo splits into two progen- 
itors, but unlike in the "block model," the mass ratio of 
the two progenitors can take any value. This technique 
enables the merger process to be followed with high time 
resolution, as timesteps are not imposed on the tree but 
rather are controlled directly by the frequency of mergers. 
It is similar in spirit to the method used by Kauffmann et 
al. (1993), but has several advantages, including that it 
does not require the storage of large tables of progenitor 
distributions. The new merger scheme is fully described 
in Lacey & Cole (in preparation). 
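
As a rough illustration of the data structure only, the sketch below builds a binary tree in which each halo splits into two progenitors with an arbitrary mass ratio, down to a resolution mass. It is a toy: the split fraction is drawn from a uniform distribution as a placeholder, not from the extended Press-Schechter progenitor mass function, and the tree carries no redshift information or smooth accretion. The names `build_tree`, `leaf_masses` and `m_res` are hypothetical.

```python
import random


def build_tree(mass, m_res, rng):
    """Recursively split a halo of the given mass into two progenitors.
    Halos below twice the resolution mass m_res are left unsplit."""
    node = {"mass": mass, "progenitors": []}
    if mass >= 2.0 * m_res:
        lo = m_res / mass
        # Placeholder split rule: uniform in the allowed range. A real
        # scheme would sample the progenitor mass function instead.
        frac = rng.uniform(lo, 1.0 - lo)
        m1 = frac * mass
        node["progenitors"] = [
            build_tree(m1, m_res, rng),
            build_tree(mass - m1, m_res, rng),
        ]
    return node


def leaf_masses(node):
    """Collect the masses of the unsplit (earliest) progenitors."""
    if not node["progenitors"]:
        return [node["mass"]]
    masses = []
    for child in node["progenitors"]:
        masses.extend(leaf_masses(child))
    return masses


tree = build_tree(100.0, 1.0, random.Random(42))
leaves = leaf_masses(tree)
print(len(leaves), sum(leaves))
```

Because every branch conserves mass, the leaf masses sum to the final halo mass, which is a convenient sanity check for any tree-building scheme.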

A further modification to the scheme is that the sin- 
gular isothermal sphere model adopted by Cole et al. as 
a description of the dark matter halo density profile has 
been replaced by the analytical form proposed by Navarro, 
Frenk & White (1996) on the basis of high resolution
N-body simulations. In the original Cole et al. model all
the gas that could cool over the entire life of the halo was 
assumed to be available to form stars from the beginning 
of the halo lifetime. We now estimate the supply of cold 
gas available to form stars by calculating the cooling rate 
at a series of discrete timesteps in which successive shells 
of gas can cool. 

In our scheme, a cosmological model is specified by an 
assumption about the nature of the dark matter together 
with values for the cosmological parameters: the mean
cosmic density ($\Omega_0$), the cosmological constant ($\Lambda_0$),
Hubble's constant ($H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$), the rms mass
fluctuations in spheres of radius $8\,h^{-1}$ Mpc ($\sigma_8$), and the
mean baryon density in units of the critical density ($\Omega_b$).
Our galaxy formation prescription requires specifying 6
physical parameters: (i) a star formation timescale ($\tau_0$),
(ii) a "feedback parameter," (iii) the shape of the initial
mass function (IMF) of stars, (iv) an overall luminosity
normalisation given by the ratio of the total mass in stars, 
including brown dwarfs, to the mass in luminous stars, 
(v) a merger timescale for galaxies, and (vi) the threshold 
mass for a galaxy merger to turn a disk into a spheroid 
(see Cole et al. 1994 and Baugh et al. 1996b for further
details).

The general strategy that we have adopted in this and 
previous papers (Baugh et al. 1996a,b, Frenk et al. 1996,
1997), is to fix the first 5 basic parameters of the model 
to obtain the best possible match to the local B-band and 
K-band galaxy luminosity functions and the sixth parameter
to reproduce the local relative abundances of ellipticals,
S0's and spirals. It turns out that these requirements
severely restrict the allowed range of parameter values, 
except for the IMF which in any of the commonly used 
forms (i.e. Salpeter (1955), Miller-Scalo (1979) or Scalo
(1986)) has little effect on the predicted local luminosity
function. The evolution of the characteristic luminosity,
$L_*$, however, is sensitive to the choice of IMF, which
therefore affects predictions for the counts of faint galaxies (see
Cole et al. 1994). For the most part, we have adopted 
the same values of the parameters as used in the fiducial 
model of Cole et al., allowing ourselves the freedom to use 
different forms for the IMF. We again assume that feed- 
back is a strong function of the halo circular velocity. The 
two exceptions are that we have slightly reduced the star
formation timescale, $\tau_0$, from 2 to 1.5 Gyr and we have
doubled the ratio of the galaxy merger timescale to the
dynamical time in the halo. The former change leads to
an abundance of Lyman-break galaxies in better agreement
with the data (cf. § 3.2) while the latter change
compensates for minor differences introduced by our new 
Monte-Carlo scheme for the halo merger trees. With this 
choice of parameters, our new model produces luminos- 
ity functions that are very similar to those published by 
Cole et al. We have updated the original Bruzual-Charlot 
stellar population synthesis model with their new version 
(also for solar metallicity only; Bruzual & Charlot 1993,
Charlot, Worthey & Bressan 1996).

Fixing the model parameters by reference to a small sub- 
set of the data produces a fully specified model which can 
then be tested against other data, particularly high red- 
shift data. Thus specified, our model has predictive power 
and we have presented a number of specific predictions in 
earlier papers. Two of these are particularly relevant to
the present discussion. The first concerns the redshift dis- 
tribution of a survey of faint galaxies limited to magnitude 
B = 24 (see figure 20 of Cole et al. 1994). Data have now 
been obtained by Glazebrook et al. (1995b) and by Cowie 
et al. (1996). Our model predictions are in good agree- 
ment with these data, as may be seen in Fig. 1 of Frenk 
et al. (1997) (see also Fig. 15 of White & Frenk 1991 and 
Kauffmann, Guiderdoni & White 1994). The second
prediction, to which we will return in Section 5 of this paper,
is the cosmic star formation history, presented in Figure 21 
of Cole et al. (1994) and in Fig. 14 below. 

2.2. Modelling galaxies at high redshift 

We begin by constructing merger trees for a grid of
halo masses, specified at some redshift $z_{\mathrm{halo}}$. Typically,
we generate between 5 and 20 different realizations for
each mass, depending upon the Press-Schechter abundance.
The galaxy formation rules are applied along the
branches of each halo tree, starting at the highest redshift
of interest and propagating through to $z_{\mathrm{halo}}$.
Volume-limited samples or redshift catalogues are generated from
the model output by weighting the galaxies in a halo tree
of a given mass by its predicted Press-Schechter abundance
at $z_{\mathrm{halo}}$. Mock catalogues consisting of galaxies selected
according to any colour-magnitude criteria can be readily
generated from the model output.
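
The weighting step can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each entry of the halo mass grid carries an abundance (a number density of halos of that mass at $z_{\mathrm{halo}}$) and a list of realizations, each realization being the list of galaxies produced by one merger tree. The dictionary keys `abundance`, `realizations` and `mag`, and the function name, are hypothetical.

```python
def weighted_number_density(halo_grid, is_selected):
    """Number density of selected galaxies, averaging over the
    realizations generated for each halo mass and weighting by the
    abundance of halos of that mass."""
    total = 0.0
    for entry in halo_grid:
        n_real = len(entry["realizations"])
        for galaxies in entry["realizations"]:
            n_selected = sum(1 for g in galaxies if is_selected(g))
            total += entry["abundance"] * n_selected / n_real
    return total


# Hypothetical two-mass grid; abundances in arbitrary units per volume.
grid = [
    {"abundance": 1e-3,
     "realizations": [[{"mag": 24.0}, {"mag": 26.0}], [{"mag": 24.5}]]},
    {"abundance": 1e-4,
     "realizations": [[{"mag": 23.0}]]},
]
print(weighted_number_density(grid, lambda g: g["mag"] < 25.0))
```

Any colour-magnitude selection can be passed in as the `is_selected` predicate, which mirrors the way a mock catalogue is cut from the model output.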

To construct mock catalogues with the selection crite- 
ria of S96, we calculate broad band colours for the model 
stellar populations using the same set of customised filters 
employed by S96. (The filter functions were kindly pro- 
vided by C. Steidel.) The dominant effect that determines 
whether a galaxy is a UV dropout is absorption of the 
galactic UV light by intervening cold gas. We calculate 
the effect of absorption on the spectral energy
distributions (SEDs) of high redshift galaxies using the procedure
developed by Madau (1995). In this way, we are able to 
select model galaxies according to exactly the same colour 
criteria as applied to the observational data by S96. 

For most of this study we have generated trees
starting at $z_{\mathrm{halo}} = 2.6$. A high starting redshift is desirable
in order to minimize inaccuracies at the high mass end of
the mass distribution of progenitors introduced when the
Monte-Carlo scheme is applied over a large range of
expansion factor (Lacey & Cole, in preparation). We have
checked, however, that our results are insensitive to the
exact choice of $z_{\mathrm{halo}}$. To obtain the expected total number
of galaxies in the apparent magnitude range observed by
Steidel et al., $\mathcal{R}_{AB} < 25.0$, we also generated mock galaxy
samples from a grid of halo masses laid down at $z_{\mathrm{halo}} = 0$.
At $\mathcal{R}_{AB} < 25.0$, the median redshift is z ~ 0.7.

3. MODEL RESULTS 

In this section we investigate the properties of the 
Lyman-break galaxies that form in our models and, wher- 
ever possible, we compare these to the properties of the 
real Lyman-break galaxies discovered by S96. Specifically, 
we consider the number density, masses, star formation 
rates and sizes of these galaxies and we present predictions 
for their clustering properties. To fully specify a model we
need to adopt values for the cosmological parameters, $\Omega_0$,
$\Lambda_0$, $h$ and $\sigma_8$, and also a value for the baryon fraction,
$\Omega_b$, and an IMF. There are considerable uncertainties in
these choices. We carry out calculations in three different
cosmological models: the standard CDM model ($\Omega_0 = 1$,
$h = 0.5$, $\sigma_8 = 0.67$); a flat, low-density CDM model
($\Omega_0 = 0.3$, $\Lambda_0 = 0.7$, $h = 0.6$, $\sigma_8 = 0.97$) and an open
CDM model ($\Omega_0 = 0.4$, $h = 0.6$, $\sigma_8 = 0.68$). These
parameters are typical of those favoured by large-scale
structure constraints. For example, all of our models produce
approximately the correct abundance of galaxy clusters at
the present day and, under standard assumptions, the
low-density models also match the 4-year COBE microwave
background anisotropy data (Bennett et al. 1996; Liddle
et al. 1996; White, Efstathiou & Frenk 1993; Eke, Cole &
Frenk 1996; Viana & Liddle 1996; Cole et al. 1997). For
our standard $\Omega_0 = 1$ cosmology, we vary the normalisation
of the primordial fluctuation spectrum, considering both
the above value, $\sigma_8 = 0.67$, adopted in the fiducial model
of Cole et al. (1994), and the lower value, $\sigma_8 = 0.5$,
preferred by Eke, Cole & Frenk (1996) for consistency with
the observed cluster X-ray temperature function.

We consider models with two different baryon fractions,
$\Omega_b h^2 = 0.015$ and $\Omega_b h^2 = 0.030$. The first agrees with
the estimate by Copi, Schramm & Turner (1996) from Big
Bang nucleosynthesis and the second is consistent with the
claim of Tytler, Fan & Burles (1996) of a low primordial
deuterium abundance in gas clouds at high redshift. We 
consider three possibilities for the IMF, all of which are 
consistent with solar neighbourhood data, given the un- 
certainties in its past star formation history: Miller-Scalo 
(1979), Scalo (1986) and Salpeter (1955) (see figure 4 in 
Cole et al. 1994 for the specific parametrizations used). 
The parameters of these models (and of variants consid- 
ered below) are summarized in Table 1. 

3.1. Two-colour selection 

We first consider the broad-band colours of our model 
galaxies and test the assumption that high redshift galax- 
ies can be efficiently identified from their location in the 
$U_n - G$ vs $G - \mathcal{R}$ colour-colour plot constructed by S96.
The analysis of Steidel, Pettini & Hamilton (1995) suggests
that galaxies with redshifts in the range 3.0 < z < 3.5
should lie within the trapezium bounded by the dashed
lines in Fig. 1. At these redshifts, the Lyman break passes
through the observer's frame $U_n$ band.

Our predicted colour-colour diagram, for the case of
standard CDM, is shown in Fig. 1a. The data plotted
correspond to an area of 14.6 square arcminutes, equal to
the area of the Q0347-3819 field observed by Steidel,
Pettini & Hamilton (1995). (For clarity the points have been
given small random displacements in the x and y
directions. The localizations of the points in bands reflect the
discrete set of output redshifts.) The open triangles in
Fig. 1a indicate galaxies that have redshifts in the range
3.0 < z < 3.5. As anticipated by Steidel et al. (1995),
all galaxies with 3.0 < z < 3.5 lie inside the trapezium in
Fig. 1a. However, our models show some contamination
by galaxies with redshifts in the range 2.7 < z < 3.0 that
congregate near the bottom-left corner of the trapezium.
This contamination is, in fact, consistent with the
evolutionary tracks shown in figure 2 of Steidel et al. (1995).

FIG. 1 Colour-colour diagrams for galaxies in the standard CDM model. Galaxies brighter than $\mathcal{R}_{AB} \sim 25.5$ are plotted; the area
of the mock field sampled is 14.6 square arcminutes. (a) $U_n - G$ colours computed from the true $U_n$ magnitude for all galaxies.
The open circles show galaxies with redshifts z < 3.0 and the open triangles galaxies with redshifts in the range 3.0 < z < 3.5.
Model results are output at specific redshifts and this is reflected in the discrete regions populated in the diagram. The redshifts
of selected outputs are indicated. (b) $U_n - G$ colours computed by setting the $U_n$ magnitude to a detection limit of $U_n = 26.96$
whenever $U_n$ is fainter than the detection limit. Galaxies in this class are denoted by filled symbols. As in panel (a), the triangles
denote galaxies with redshifts in the range 3.0 < z < 3.5.

Arbitrarily deep Lyman breaks cannot be measured in
practice because the observations are subject to a
detection limit in the $U_n$ band. For the Q0347-3819 field, the
$3\sigma$ detection limit is $U_n = 26.96$. Fig. 1b illustrates the
effect of imposing such a detection limit on the appearance
of the colour-colour plot. Whenever a galaxy has a
true $U_n$ magnitude fainter than the field limit, we plot it
at the assumed $U_n$ limit using a filled symbol. A significant
fraction of high redshift galaxies now lie below the
trapezium region, confirming the remark by Steidel et al.
(1995) that their selection of candidates is likely to be an
underestimate of the true abundance.
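
The effect of the detection limit on the measured colour can be written down in a few lines. The sketch below is illustrative only: it takes the $U_n = 26.96$ limit quoted above for the Q0347-3819 field, and the function name and return convention are assumptions made here. A galaxy fainter than the limit is assigned the limiting magnitude, so its $U_n - G$ colour becomes a lower limit.

```python
# 3-sigma U_n detection limit quoted in the text for the Q0347-3819 field.
UN_LIMIT = 26.96


def observed_un_minus_g(un_true, g_mag, un_limit=UN_LIMIT):
    """Return the U_n - G colour as it would be plotted, and a flag that
    is True when the colour is only a lower limit (U_n undetected)."""
    if un_true > un_limit:
        return un_limit - g_mag, True
    return un_true - g_mag, False


# A galaxy with a very deep break moves down in the colour-colour plot.
print(observed_un_minus_g(30.0, 25.0))
# A galaxy detected in U_n keeps its true colour.
print(observed_un_minus_g(26.0, 25.0))
```

Since the plotted colour of an undetected galaxy is bluer than its true colour, objects with genuine deep breaks can fall below the selection region, as described above.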

Comparison of Fig. 1b with figures 4, 7, 10 and 13 of
Steidel et al. (1995) shows that our models tend to
produce too few galaxies with ($U_n - G$) and ($G - \mathcal{R}$) less than
unity. Galaxies in this part of the observational diagram
are likely to be predominantly faint foreground dwarf
irregulars. The discrepancy may be due in part to our use
of solar metallicity stellar populations. This is also apparent
in the comparison between our models and the local
luminosity function determined by Lilly et al. (1995) from
the CFRS survey (see figure 16 of Baugh, Cole & Frenk
1996b). Beyond $z \simeq 0.2$, however, the luminosity functions
of our models for both red and blue galaxies agree
reasonably well with the CFRS data.



[Figure 2: panel (a) model A; panel (b) models A and G; horizontal axis redshift z, from 2.6 to 3.8.]

FIG. 2 Predicted redshift distributions for galaxies brighter
than apparent magnitude $\mathcal{R}_{AB} = 25.0$ selected in various ways.
Panel (a) shows results for our standard CDM model A. The
solid line shows the distribution for galaxies with redshift in
the range 3.0 < z < 3.5. The dotted line shows the distribution
for galaxies that meet the colour criteria of Steidel et al.
(1995). The dashed line refers to galaxies that meet the Steidel
et al. (1995) colour selection, after the detection limit in $U_n$
for a typical observed field has been applied - model galaxies
fainter than this in $U_n$ are assigned this limiting magnitude. In
panel (b) the dashed histogram is repeated from panel (a) and
the solid histogram shows the corresponding redshift distribution
of galaxies brighter than $\mathcal{R}_{AB} = 25$ in the low-$\Omega_0$ model G
that satisfy the Steidel et al. colour
selection criteria, taking into account the typical field limit in
$U_n$. These and subsequent histograms are normalized so that
they enclose unit area.

3.2. The Abundance of high redshift galaxies 

Steidel, Pettini & Hamilton (1995) defined a "robust
candidate" for a Lyman-break galaxy in the redshift range
3.0 < z < 3.5 to be an object brighter than $\mathcal{R}_{AB} = 25$,
with $U_n - G$ and $G - \mathcal{R}$ colours in the trapezium region
of Fig. 1 and which is undetected in the $U_n$ band. They
estimated the surface density of robust candidates to be
$0.40 \pm 0.07$ arcmin$^{-2}$, corresponding to 1.3% of their
total counts brighter than $\mathcal{R}_{AB} = 25$. These counts are
30 arcmin$^{-2}$, with a Poisson uncertainty of 2%. Brainerd
et al. (1995) quote a value of 47 arcmin$^{-2}$ for the counts to
the same magnitude. The difference seems to stem from
different incompleteness corrections and uncertainties in
the conversion from aperture to total magnitudes.
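
The arithmetic behind these numbers is simple enough to check directly. The sketch below uses only the values quoted above; the variable names are chosen here for illustration. It recovers the 1.3% fraction, shows how that fraction would change if the Brainerd et al. (1995) total counts were used instead, and converts the surface density to the per-square-degree units used when comparing with model predictions.

```python
ROBUST_DENSITY = 0.40       # robust candidates per arcmin^2
ROBUST_ERROR = 0.07         # quoted uncertainty, per arcmin^2
TOTAL_COUNTS_STEIDEL = 30.0   # total counts per arcmin^2
TOTAL_COUNTS_BRAINERD = 47.0  # total counts per arcmin^2
ARCMIN2_PER_DEG2 = 60.0 * 60.0

fraction_steidel = ROBUST_DENSITY / TOTAL_COUNTS_STEIDEL
fraction_brainerd = ROBUST_DENSITY / TOTAL_COUNTS_BRAINERD
density_per_deg2 = ROBUST_DENSITY * ARCMIN2_PER_DEG2
error_per_deg2 = ROBUST_ERROR * ARCMIN2_PER_DEG2

print(f"fraction of total counts (30 per arcmin^2): {100 * fraction_steidel:.1f}%")
print(f"fraction of total counts (47 per arcmin^2): {100 * fraction_brainerd:.2f}%")
print(f"surface density: {density_per_deg2:.0f} +/- {error_per_deg2:.0f} per deg^2")
```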

We now examine which, if any, of our hierarchical 
clustering models produces an acceptable abundance of 
Lyman-break galaxies. The parameters of the models we 
have investigated are summarized in columns 2-6 of Table 1.
The seventh column gives the luminosity normalization
of each model, $\Upsilon$, which is defined as the ratio of the total
mass in stars formed in the model to the mass formed in
luminous stars, i.e. stars with mass greater than $0.1\,M_\odot$.
This parameter is set in all cases to match the knee of
the observed local field galaxy luminosity function, as
described by Cole et al. (1994).

Table 2 gives the abundance of galaxies predicted by
our models. Where available, the observed values are
shown in the bottom row of the table. The second column,
$N(\mathcal{R}_{AB} < 25.0)$, gives the total number of galaxies
brighter than $\mathcal{R}_{AB} = 25$. The next three columns
demonstrate the effect of applying various selection criteria
to this $\mathcal{R}_{AB} < 25$ sample. The third column,
$N(3.0 < z < 3.5)$, gives the number of galaxies per square
degree with redshift in the range 3.0 < z < 3.5. The fourth
column, $N(\mathrm{colour})$, gives the number of galaxies in the region
of the colour-colour diagram from which Steidel et al.
(1995) selected their high redshift candidates. As shown
in Fig. 1, a significant fraction of these galaxies are at
redshifts just below 3. The fifth column, $N(\mathrm{colour}+U_n)$, is the
number of galaxies remaining in the colour selected region
after the $U_n < 26.96$ magnitude limit for the Q0347-3819
field is applied and model galaxies fainter than this have
had a $3\sigma$ lower limit assigned for their $U_n - G$ colour. This
removes roughly half of the high redshift candidates in
column 4 from the colour selected region. The final column
gives the fraction of galaxies that meet the colour selection
criteria, after applying a typical field limit in $U_n$, as a
percentage of the total counts brighter than $\mathcal{R}_{AB} = 25.0$.

FIG. 3 Predicted galaxy number counts. In panel (a), the higher amplitude pair of
lines shows the total counts in model A (solid line) and model G (dotted line).
The lower amplitude pair of lines shows the counts of galaxies in each of these
models that satisfy the Steidel et al. colour selection criteria alone, without
any constraint on their $U_n$ magnitudes. In panel (b) the total counts in model A
are again shown by the solid line. The dotted line now shows the counts of colour
selected galaxies; the short-dashed line shows the counts of galaxies with
redshifts in the range $3.0 < z < 3.5$; and the long-dashed line shows the counts
of galaxies with redshifts in the range $4.0 < z < 4.5$.

Table 1

Model parameters. Columns 2-6 give the values of the cosmological parameters
defined in the text. Column 7 gives $\Upsilon$, the ratio of the total mass in
stars to the mass in luminous stars. Column 8 indicates the stellar initial mass
function (IMF) used.

model   $\Omega_0$   $\Lambda_0$   $h$    $\sigma_8$   $\Omega_b$   $\Upsilon$   IMF
A       1.0          0.0           0.5    0.67         0.06         2.8          Miller-Scalo
B       1.0          0.0           0.5    0.67         0.06         2.3          Scalo
C       1.0          0.0           0.5    0.67         0.06         1.5          Salpeter
D       1.0          0.0           0.5    0.50         0.06         2.5          Miller-Scalo
E       1.0          0.0           0.5    0.67         0.12         6.4          Miller-Scalo
F       0.4          0.0           0.6    0.68         0.04         2.8          Miller-Scalo
G       0.3          0.7           0.6    0.97         0.04         2.4          Miller-Scalo
H       0.3          0.7           0.6    0.97         0.08         4.2          Miller-Scalo

Table 2

The abundance of high redshift galaxies per square degree brighter than
$\mathcal{R}_{AB} = 25$ for the models listed in Table 1. The quoted numbers are
derived from 10 realisations of a catalogue each covering 0.1 square degrees. The
final row gives results from the observational study of S96.

model      $N(\mathcal{R}_{AB} < 25.0)$   $N(3.0 < z < 3.5)$   $N({\rm colour})$   $N({\rm colour}+U_n)$   % of $N(\mathcal{R}_{AB} < 25.0)$
A          153 000                        890                  2000                1000                    0.7
B          108 000                        50                   130                 60                      0.05
C          142 000                        1100                 2400                1300                    0.9
D          117 000                        60                   180                 110                     0.09
E          152 000                        1100                 2100                1100                    0.7
F          101 000                        1100                 2200                1000                    1.0
G          126 000                        2400                 5100                2900                    2.3
H          158 000                        4200                 8500                4700                    3.0
observed   110 000                        -                    -                   1400 ± 300              1.3 ± 0.3

It is clear from Table 2 that the abundance of high redshift galaxies expected in
a given cosmology is very sensitive to the adopted IMF and to $\sigma_8$, the
normalisation of the primordial power spectrum. For example, replacing the Scalo
by the Miller-Scalo IMF in our fiducial standard CDM model produces an increase
of nearly a factor of 20 in the number of high redshift galaxies listed in
column 5. This sensitivity arises because, when normalized to the same total mass
in luminous stars, the Miller-Scalo IMF contains several times more stars of
around $10\,M_\odot$ than the Scalo IMF. Similarly, increasing $\sigma_8$ in the
$\Omega_0 = 1$ model from 0.5 to 0.67 increases the number of high redshift
galaxies by a factor of 10. This dependency reflects the fact that the brightest
Lyman-break galaxies at $z \simeq 3$ tend to come from the tail of rare objects in
the mass distribution at this redshift. With our procedure for normalizing the
luminosity of the models, the predicted abundances are insensitive to the value
of $\Omega_b$. However, the precise value of $\Upsilon$ does affect the
abundances; reducing $\Upsilon$ (at the expense of a poorer match to the local
luminosity function) would boost the number of high redshift galaxies.
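The IMF sensitivity can be illustrated with a minimal sketch that compares two single power-law IMFs normalized to the same total stellar mass. This is only an illustration of the mechanism: the actual Scalo and Miller-Scalo IMFs are not simple power laws, and the slopes (the Salpeter value 1.35 and a hypothetical steeper value 1.7), the mass limits (0.1 to 125 solar masses) and the 8 to 12 solar mass window are assumptions made here.

```python
def powerlaw_moment(x, m_lo, m_hi, k):
    """Integral of m**k * m**-(1 + x) dm from m_lo to m_hi (requires k != x)."""
    p = k - x  # the antiderivative is m**p / p
    return (m_hi ** p - m_lo ** p) / p


def massive_stars_per_unit_mass(x, m_min=0.1, m_max=125.0):
    """Number of 8-12 Msun stars per unit total stellar mass for dN/dm ~ m**-(1+x)."""
    total_mass = powerlaw_moment(x, m_min, m_max, 1)  # integral of m dN/dm
    number = powerlaw_moment(x, 8.0, 12.0, 0)         # integral of dN/dm
    return number / total_mass


shallow = massive_stars_per_unit_mass(1.35)  # Salpeter slope
steep = massive_stars_per_unit_mass(1.7)     # hypothetical steeper slope
ratio = shallow / steep
print(f"relative number of ~10 Msun stars (shallow / steep): {ratio:.2f}")
```

Even this modest change in slope alters the number of massive, ultraviolet-bright stars per unit stellar mass by a factor of a few, which is the sense of the effect described above.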

The conclusion to be drawn from Table 2 is that it is possible to reproduce the
observed number of Lyman-break galaxies, 1400 ± 300 per square degree, in a
variety of cosmological models by reasonable adjustments to the input parameters.
For example, within the observational errors, the standard CDM model produces the
required abundance if the Miller-Scalo (Model A) or the Salpeter (Model C) IMF is
assumed. Increasing the characteristic star formation timescale, $\tau_0$, from
1.5 Gyr to 2 Gyr decreases the number of Lyman-break galaxies by about a factor
of two. Similarly, the open model F with a Miller-Scalo IMF is acceptable. The
flat low-$\Omega_0$ models G and H produce about 2 to 3 times as many Lyman-break
galaxies as observed. These abundances would be reduced, however, if the
Miller-Scalo IMF were replaced by the Scalo IMF or if $\tau_0$ were increased.
Model D, the standard CDM cosmology with density fluctuations normalised to
reproduce the observed abundance of rich clusters (Eke et al. 1996), produces far
too few Lyman-break galaxies. However, it would be premature to conclude that a
standard CDM model with this normalization is incompatible with the high redshift
data. For example, a simple modification of the Cole et al. (1994) star formation
rules, in which the timescale $\tau_0$ is scaled with the dynamical time of the
galaxy ($\tau_0 \propto (1+z)^{-3/2}$) instead of remaining constant with
redshift, results in an increase of $N({\rm colour}+U_n)$ from 80 to 1900, without
significant change in the properties of galaxies at the present time. We plan to
explore the effects of such variations in the modelling of star formation and
feedback in a future paper (Cole et al. in preparation). Note that not all the
models listed in Table 2 reproduce the total counts of galaxies brighter than
$\mathcal{R}_{AB} = 25$ quoted by Steidel et al. (1995). However, as mentioned
above, these counts may be uncertain by a significant factor.
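The size of the modification described above can be gauged with a short sketch of the $(1+z)^{-3/2}$ scaling. It assumes, for illustration only, that the scaled timescale equals the constant value of 1.5 Gyr at $z = 0$; the function and parameter names are hypothetical.

```python
def star_formation_timescale(z, tau_present_gyr=1.5, scale_with_dynamical_time=True):
    """Characteristic star formation timescale in Gyr at redshift z.

    With scaling switched on, the timescale follows (1 + z)**-1.5 and is
    normalized (an assumption of this sketch) to tau_present_gyr at z = 0.
    With scaling switched off, it is constant with redshift.
    """
    if scale_with_dynamical_time:
        return tau_present_gyr * (1.0 + z) ** -1.5
    return tau_present_gyr


tau_constant = star_formation_timescale(3.0, scale_with_dynamical_time=False)
tau_scaled = star_formation_timescale(3.0)
print(f"constant rule at z = 3: {tau_constant:.4f} Gyr")
print(f"scaled rule at z = 3:   {tau_scaled:.4f} Gyr")
```

Under this normalization the timescale at $z = 3$ is eight times shorter than in the constant case, so cold gas is converted into stars much more rapidly at high redshift.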

Our predicted redshift distribution for Lyman-break galaxies after the various
selection criteria have been applied is shown in Fig. 2. Only galaxies brighter
than $\mathcal{R}_{AB} = 25$ are included in this plot. As was evident from
Fig. 1, the Steidel et al. (1995) colour criteria allow a significant population
of galaxies with redshifts just below 3. Introducing a $U_n$-band detection limit
biases the sample against the highest redshift galaxies, whose light experiences
the most absorption by intervening cold gas, skewing the distribution of robust
candidates towards $z \simeq 3$. The top panel of Fig. 2 shows results for our
standard CDM model A and the bottom panel compares these with results from the
flat low-$\Omega_0$ model G. The two distributions, heavily constrained by the
selection criteria, are very similar.

Fig. 3 shows the number counts of galaxies predicted in two of our models. In
Fig. 3a, the solid lines refer to model A and the dotted lines to model G. The
higher amplitude pair of curves gives the total number counts in these two models
while the lower amplitude curves give the number of objects that satisfy the
Steidel et al. $U_n - G$, $G - \mathcal{R}$ colour criteria. The fraction of the
total counts represented by Lyman-break galaxies increases rapidly with
increasing magnitude. The counts in model A are shown in more detail in Fig. 3b.
Again, the high amplitude solid curve gives the total counts while the dotted
line shows the number of galaxies that satisfy the colour selection criteria. The
counts of galaxies with redshifts in the range $3.0 < z < 3.5$ are indicated by
the short-dashed line and the counts at $4.0 < z < 4.5$ are shown by the
long-dashed line. The latter are lower by about one order of magnitude.

Steidel et al. (1996b) and Madau et al. (1996) have applied a similar technique
to isolate high redshift galaxies in the Hubble Deep Field (HDF; Williams et al.
1996). The HDF was imaged in four passbands and so two-colour selection can be
applied to select galaxies that 'drop out' in two passbands, $U_{300}$ and
$B_{450}$. The $U_{300}$ dropouts are predicted to lie in the redshift range
$2.0 < z < 3.5$ whilst the $B_{450}$ dropouts should have redshifts in the range
$3.5 < z < 4.5$ (Madau et al. 1996). We have made mock HDF catalogues from the
output of our model, using the same filters and applying the detailed colour
selection given by Madau et al. A comparison of the abundance of high redshift
objects with the inferred abundances for the HDF is given in Tables 3 and 4. In
Table 3 we consider galaxies brighter than $V_{606} = 28.0$ and
$B_{450} = 26.8$, while in Table 4 we consider galaxies brighter than
$V_{606} = 27.7$. These data lead to a similar conclusion to that drawn from
Table 2: several of our models (like some of the more successful models of White
& Frenk 1991) predict approximately the observed abundance of high redshift
galaxies throughout the redshift range $2.0 < z < 4.5$. Our predicted abundances
are sensitive to the IMF assumed: model B, with a Scalo IMF, gives seven times
fewer $B_{450}$ dropouts than model A, which uses a Miller-Scalo IMF.

3.3. Properties of high redshift galaxies 

In this Section we consider the properties of galaxies in models A and G that
satisfy the Steidel et al. colour selection criteria and that are brighter than
$\mathcal{R}_{AB} = 25.0$; we do not apply any conditions on their $U_n$
magnitudes.

3.3.1. Dark matter halos

The masses of the dark matter halos that harbour Lyman-break galaxies brighter
than $\mathcal{R}_{AB} = 25.0$ and the circular velocities of the halos in which
these galaxies formed are plotted in Fig. 4. The solid lines correspond to the
standard CDM model A, and the dashed lines to the flat, low-$\Omega_0$ model G.
The halo masses plotted in the top panel are remarkably similar in the two
cosmologies. This is largely a coincidence arising from the interplay between the
selection criteria imposed on the galaxies and the overall halo mass
distributions in the two cosmologies. The circular velocities plotted in the
bottom panel are also similar, with a shift towards lower values in the
low-$\Omega_0$ cosmology.

Table 3

Comparison of the abundance of high redshift galaxies in models A and G with the
abundances inferred from the Hubble Deep Field by Madau et al. (1996). Madau et
al. estimate that the $U_{300}$-band dropouts lie in the redshift range
$2.0 < z < 3.5$ and the $B_{450}$-band dropouts in the redshift range
$3.5 < z < 4.5$. This Table refers to $U_{300}$-band dropouts, whilst Table 4
refers to $B_{450}$-band dropouts. The values in the table give the number of
objects per square degree. The final column gives the number of colour selected
objects as a percentage of the total number of objects in the sample.

model      $N_{\rm tot}(B_{450} < 26.8)$   $N(2.0 < z < 3.5)$   $N({\rm colour})$       % of $N_{\rm tot}$
A          $360 \times 10^3$               $54 \times 10^3$     $66 \times 10^3$        18
G          $290 \times 10^3$               $65 \times 10^3$     $79 \times 10^3$        27
observed   $(320 \pm 16) \times 10^3$      -                    $(46 \pm 6) \times 10^3$    14

Table 4

Comparison of the abundance of high redshift galaxies in models A and G with the
abundances inferred from the Hubble Deep Field by Madau et al. (1996) for
$B_{450}$-band dropouts.

model      $N_{\rm tot}(V_{606} < 27.7)$   $N(3.5 < z < 4.5)$   $N({\rm colour})$           % of $N_{\rm tot}$
A          $790 \times 10^3$               $9.9 \times 10^3$    $5.9 \times 10^3$           0.7
G          $620 \times 10^3$               $16 \times 10^3$     $9.4 \times 10^3$           1.5
observed   $(620 \pm 20) \times 10^3$      -                    $(9.3 \pm 2.5) \times 10^3$   1.5

FIG. 4 Masses and circular velocities of the dark matter halos that harbour
Lyman-break galaxies. Galaxies brighter than $\mathcal{R}_{AB} = 25.0$ satisfying
the colour criteria of Steidel et al. (1995) are included. The solid lines show
the distributions for the standard CDM model A, whilst the dotted lines show
distributions for the flat low-$\Omega_0$ model G. The top panel gives the
distribution of halo masses, and the bottom panel the distribution of circular
velocities for the halos in which each galaxy formed.

S96 estimated velocity dispersions for the Lyman-break galaxies from the widths
of heavily saturated interstellar absorption lines. As they point out, these
measurements may be contaminated by turbulent motions in the gas. Alternatively,
they may be due entirely to gravitationally supported random motions and, in this
case, their measurements indicate velocity dispersions in the range
$\sigma_{1D} = 180 - 320$ km s$^{-1}$, corresponding to circular velocities
$V_c = \sqrt{2}\,\sigma_{1D} = 250 - 450$ km s$^{-1}$. If the line widths are due
to rotational motion in a disk of constant circular velocity, $V_c$, then the
observed range of full-width at half-maximum, $400 - 700$ km s$^{-1}$,
corresponds to $V_c \simeq 250 - 430$ km s$^{-1}$ for randomly oriented disks.
Our model predictions in both cosmologies, illustrated in Fig. 4, are consistent
with these numbers. Note, however, that the circular velocities plotted in the
Figure are asymptotic halo values. The actual circular velocity is a function of
radius. This, as well as the redistribution of mass associated with the formation
of the galaxy, will affect what can be measured observationally. In principle,
the measured velocities can be substantially different from the asymptotic halo
values. We intend to explore this issue in detail in a subsequent paper.
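The conversion from one-dimensional velocity dispersion to circular velocity quoted for the S96 line widths, $V_c = \sqrt{2}\,\sigma_{1D}$, can be checked directly. The sketch below covers only the random-motion case; the rotating-disk case depends on an inclination average that is not spelled out in the text and is not attempted here. The function name is illustrative.

```python
import math


def circular_velocity_from_dispersion(sigma_1d_kms):
    """Circular velocity (km/s) for a one-dimensional dispersion, V_c = sqrt(2) * sigma."""
    return math.sqrt(2.0) * sigma_1d_kms


v_low = circular_velocity_from_dispersion(180.0)
v_high = circular_velocity_from_dispersion(320.0)
print(f"V_c range: {v_low:.0f} to {v_high:.0f} km/s")
```

Rounded to the nearest 50 km/s, these limits match the quoted range of 250 to 450 km/s.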

3.3.2. Stellar masses and star formation rates

The stellar masses of our model Lyman-break galaxies brighter than
$\mathcal{R}_{AB} = 25.0$ are plotted in Fig. 5. As before, the solid line shows
results for the $\Omega_0 = 1$ model A and the dotted line for the flat,
low-$\Omega_0$ model G. The stellar masses are typically three times larger in
the low-$\Omega_0$ cosmology. This difference is the result of the selection
criteria imposed on these galaxies: luminosity distances are larger in the low
density model, so galaxies selected at a given apparent magnitude limit must have
larger luminosities, and thus larger stellar masses.
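The distance effect can be quantified with a short numerical sketch. It integrates the comoving distance for the two flat cosmologies of models A and G (parameters from Table 1) using Simpson's rule, in units of $h^{-1}$ Mpc. It isolates the geometric factor only: differences in $h$, in the stellar populations and in k-corrections between the models are ignored, so the result is not expected to reproduce the full factor of three quoted above. The function names are illustrative.

```python
import math

C_OVER_H0 = 2997.92458  # c / H0 in h^-1 Mpc, for H0 = 100 h km/s/Mpc


def comoving_distance(z, omega0, lambda0, steps=2000):
    """Comoving distance to redshift z in h^-1 Mpc for a flat model (Simpson's rule)."""
    if abs(omega0 + lambda0 - 1.0) > 1e-9:
        raise ValueError("this sketch only handles spatially flat models")
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")

    def inverse_e(zz):
        # 1 / E(z), where H(z) = H0 * E(z)
        return 1.0 / math.sqrt(omega0 * (1.0 + zz) ** 3 + lambda0)

    width = z / steps
    total = inverse_e(0.0) + inverse_e(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * inverse_e(i * width)
    return C_OVER_H0 * total * width / 3.0


def luminosity_distance(z, omega0, lambda0):
    """Luminosity distance in h^-1 Mpc for a flat model."""
    return (1.0 + z) * comoving_distance(z, omega0, lambda0)


d_model_a = luminosity_distance(3.0, 1.0, 0.0)  # Omega_0 = 1
d_model_g = luminosity_distance(3.0, 0.3, 0.7)  # Omega_0 = 0.3, Lambda_0 = 0.7
luminosity_ratio = (d_model_g / d_model_a) ** 2
print(f"luminosity ratio at fixed apparent magnitude: {luminosity_ratio:.2f}")
```

At fixed apparent magnitude the required luminosity scales as the square of the luminosity distance, so the low density model demands intrinsically brighter galaxies at $z = 3$.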

An indication of the stellar masses of real Lyman-break galaxies comes from
$K$-band imaging of 5 candidate UV dropouts carried out by S96 at the Keck
telescope. For these 5 candidates, they find $K_{AB} = 23.2 - 24.0$, and colours
$0.4 < (\mathcal{R}_{AB} - K_{AB}) < 1.3$. Galaxies in our models have $K$
magnitudes in the range $K_{AB} = 22 - 24$, with colours in the range
$0.5 < \mathcal{R}_{AB} - K_{AB} < 1.4$, in excellent agreement with the data, as
shown in Fig. 6.



FIG. 5 The stellar mass distribution of Lyman-break galaxies. Galaxies brighter
than $\mathcal{R}_{AB} = 25.0$ satisfying the colour criteria of Steidel et al.
(1995) are included. The $\Omega_0 = 1$ model A is shown by the solid line and
the low-$\Omega_0$ model G by the dotted line. The horizontal axis is
$\log(M_*/h^{-1}M_\odot)$.

The distribution of star formation rates in our model Lyman-break galaxies is
shown in Fig. 7a. These are instantaneous rates, measured directly from the mass
of cold gas turned into stars per unit time. The star formation rates at
$z \simeq 3$ are typically a few solar masses per year, and are somewhat larger
in the flat low-$\Omega_0$ model than in the standard model. Only a very small
fraction of the galaxies at these redshifts have star formation rates in excess
of 10 $M_\odot\,{\rm yr}^{-1}$ (in the $h$-scaled units of Fig. 7a). The
instantaneous star formation rates in typical galaxies at $z \simeq 3$ are
comparable to their mean rates averaged over the age of the universe at that
redshift (1.6 Gyr in model A, 2.5 Gyr in model G). We consider the history of
star formation in more detail in Section 5.

Whereas the star formation rate of a galaxy is not directly observable, the
distribution of $\mathcal{R}$ magnitudes is. Since the $\mathcal{R}$ band samples
the rest frame ultraviolet at $z \simeq 3$, the distribution of $\mathcal{R}$
magnitudes is closely related to the distribution of star formation rates. In
Fig. 7b we plot the distribution of absolute magnitude $M_{\mathcal{R}}({\rm AB})$
in our models. This distribution, however, is not a particularly strong
constraint on the models because, by design, the S96 survey covers only a narrow
range in $\mathcal{R}_{AB}$. We defer a detailed comparison between our predicted
star formation rates and observations to Section 5.


3.3.3. Galaxy sizes 



An essential feature of hierarchical models of galaxy formation is that the
typical sizes of galaxies increase with time. Thus, we expect the characteristic
radii of galaxies to be considerably smaller at high redshift than they are at
present. A detailed investigation of the evolution of galactic sizes will be
presented elsewhere (Lacey et al. in preparation). A rough indication of the
sizes of high redshift galaxies, however, may be obtained from a simple model
assuming that galaxies acquire their angular momentum from tidal torques and that
the angular momentum of the gas is conserved as it condenses within its halo.
This simple model is quite adequate because, as we discuss in Section 4, high
redshift galaxies in our model tend to have very small bulges or no bulge at all.
From this simple model, we obtain half-light radii $r_h \approx 0.4\,h^{-1}$ kpc
at $z \simeq 3$ in model A and $r_h \approx 0.6\,h^{-1}$ kpc in model G. This
rough calculation agrees reasonably well with the values of
$r_h \simeq 0.7 - 1.0\,h^{-1}$ kpc ($\Omega_0 = 1$) or
$r_h \simeq 1.0 - 1.5\,h^{-1}$ kpc ($\Omega_0 = 0.3$, $\Lambda_0 = 0.7$) measured
by Giavalisco, Steidel & Macchetto (1996) from HST follow-up observations of S96
Lyman-break galaxies.

FIG. 6 The predicted $(\mathcal{R} - K)_{AB}$ colour distribution of galaxies
brighter than $\mathcal{R}_{AB} = 25.0$ satisfying the Steidel et al. colour
selection. The top panel shows model A and the lower panel shows model G.

FIG. 7 Star formation rates and absolute $\mathcal{R}_{AB}$ magnitudes for high
redshift galaxies. Galaxies with apparent magnitudes $\mathcal{R}_{AB} < 25.0$
satisfying the colour criteria of Steidel et al. (1995) are included. Model A is
shown by the solid line and model G by the dotted line. The top panel gives the
distribution of instantaneous star formation rates in the models and the bottom
panel the distribution of absolute magnitudes,
$M_{\mathcal{R}}({\rm AB}) - 5\log h$.



3.3.4. Clustering properties 

We calculate the expected clustering of high redshift galaxies in two basic
steps. First, we calculate the non-linear power spectrum $P(k,z)$ of fluctuations
in the mass distribution in comoving coordinates, using the approximate linear to
non-linear transformation of Peacock & Dodds (1996). Next, we obtain a bias
parameter for the galaxies using the prescription of Mo & White (1996). This
gives the bias of dark matter halos of mass $M$ at redshift $z$ as

$$ b(M,z) = 1 + \frac{1}{D(z)\,\delta_c(z)} \left[ \frac{\delta_c^2(z)}{\sigma^2(M)} - 1 \right], \qquad (1) $$

where $\sigma(M)$ is the rms linear density fluctuation at $z = 0$ in a sphere of
mass $M$, $D(z)$ is the linear growth factor, and $\delta_c(z)$ is the
extrapolated critical linear overdensity for collapse at redshift $z$.
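A minimal sketch of the halo bias calculation follows. It assumes the Mo & White form $b = 1 + (\nu^2 - 1)/[D(z)\,\delta_c(z)]$ with $\nu = \delta_c(z)/\sigma(M)$, and a growth factor normalized to unity at $z = 0$. The example inputs are for an $\Omega_0 = 1$ universe at $z = 3$, where $D = 1/(1+z)$ and $\delta_c(z) = 1.686\,(1+z)$; the value $\sigma(M) = 2.5$ is a hypothetical placeholder, not a number taken from the models.

```python
def halo_bias(sigma_m, growth_factor, delta_c_z):
    """Halo bias b(M, z) in the Mo & White (1996) form assumed in this sketch.

    sigma_m       : rms linear fluctuation at z = 0 on the halo mass scale
    growth_factor : linear growth factor D(z), normalized to 1 at z = 0
    delta_c_z     : critical overdensity for collapse at z, extrapolated to z = 0
    """
    nu = delta_c_z / sigma_m  # peak height of the halo
    return 1.0 + (nu ** 2 - 1.0) / (growth_factor * delta_c_z)


z = 3.0
growth = 1.0 / (1.0 + z)       # Einstein-de Sitter growth factor
delta_c = 1.686 * (1.0 + z)    # Einstein-de Sitter collapse threshold
sigma_example = 2.5            # hypothetical sigma(M) for illustration

bias_example = halo_bias(sigma_example, growth, delta_c)
print(f"b = {bias_example:.2f}")
```

Halos with $\nu = 1$ are unbiased ($b = 1$) in this form, while the rare, high-$\nu$ halos relevant at $z \simeq 3$ have bias values of several.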




FIG. 8 The clustering of the colour selected high redshift galaxies in our
models. Panel (a) shows the predicted spatial correlation functions in comoving
co-ordinates, plotted against $\log(r/h^{-1}\,{\rm kpc})$, and panel (b) the
predicted angular correlation functions, plotted against
$\log(\theta/{\rm arcsec})$. The heavy and light lines correspond to correlations
computed from the non-linear power spectrum in models A and G respectively, after
multiplication by the bias parameter of the halos: for model A, $b = 4.2$ and for
model G, $b = 3.5$. The solid lines show the scales at which the assumption of a
constant bias parameter is expected to be valid. The extrapolation of a constant
bias to scales smaller than this is shown by broken lines. The arrow in (a) at
$r = 14\,h^{-1}$ kpc marks the comoving scale represented by 1 arcsecond at
$z = 3$. The second arrow, at $r = 1.1\,h^{-1}$ Mpc, indicates the comoving size
of a region that collapses to form an object with the median halo mass found for
the colour selected galaxies in model A. The angular scale corresponding to this
comoving length, 70 arcseconds in model A, is shown by the arrow in (b).



In the model, the majority of the Lyman-break galaxies are the central galaxy in
their dark matter halo. A good approximation to the galaxy bias function is the
mean halo bias, $b$, weighted according to the mass distribution of halos that
harbour galaxies satisfying the Steidel et al. (1995) colour selection criteria
(c.f. Fig. 4). The power spectrum of the Lyman-break galaxies is then given by
$P_{\rm gal}(k,z) = b^2 P(k,z)$. From equation (1), we find $b = 4.2$ in model A
and $b = 3.5$ in model G, making the approximation that the halos all lie at the
median redshift $z_m$ of the colour selected sample.
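The weighting step can be sketched as follows. The halo counts and per-bin bias values below are hypothetical placeholders chosen only to show the arithmetic; they are not taken from the models.

```python
def effective_bias(halo_counts, halo_biases):
    """Mean halo bias weighted by the number of host halos in each mass bin."""
    total = sum(halo_counts)
    return sum(n * b for n, b in zip(halo_counts, halo_biases)) / total


def galaxy_power(matter_power, bias):
    """Galaxy power spectrum value for a constant linear bias: P_gal = b**2 * P."""
    return bias ** 2 * matter_power


# Hypothetical mass bins: number of host halos and the halo bias in each bin.
counts = [1, 3, 1]
biases = [3.0, 4.0, 5.0]

b_eff = effective_bias(counts, biases)
p_gal = galaxy_power(2.0, b_eff)  # matter power of 2.0 in arbitrary units
print(f"effective bias = {b_eff:.1f}, P_gal = {p_gal:.1f}")
```

Because the bias enters squared, a population with $b \simeq 4$ has a power spectrum amplitude roughly sixteen times that of the underlying mass.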

The spatial correlation function $\xi(r)$ of the high redshift galaxies is
obtained by taking the Fourier transform of the power spectrum
$P_{\rm gal}(k,z)$. The angular correlation function $w(\theta)$ can be
calculated using the relativistic version of Limber's (1954) equation in the form
derived by Baugh & Efstathiou (1993). We make the approximation that the
evolution of clustering and the bias parameter are negligible over the narrow
range of redshifts in which Lyman-break galaxies satisfying the S96 criteria are
found, so that $P_{\rm gal}(k,z) \simeq P_{\rm gal}(k,z_m)$ in the integral. The
relativistic Limber equation can then be written as:



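$$ w(\theta) = \int_0^\infty k\,P_{\rm gal}(k,z_m)\,g(k\theta)\,{\rm d}k. \qquad (2) $$

The kernel function is given by:

$$ g(k\theta) = \frac{1}{2\pi}\,\frac{1}{N^2} \int_0^\infty \left(\frac{{\rm d}N}{{\rm d}z}\right)^2 \frac{{\rm d}z}{{\rm d}x}\,F(x)\,J_0(k\theta x)\,{\rm d}z, \qquad (3) $$

where $J_0$ is the zeroth-order Bessel function of the first kind and $x$ is the
comoving distance to redshift $z$. The term $F(x)$ comes from the metric and
depends on the cosmology (e.g. Peebles 1980 §56); in a flat universe $F(x) = 1$.
The redshift distribution ${\rm d}N/{\rm d}z$ is that of the colour-selected
galaxies, and $N$ is the total number of galaxies selected. The spatial and
angular correlation functions for galaxies satisfying the Steidel et al.
selection are shown in Fig. 8a and Fig. 8b.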

The derivation of the formula for the halo bias, equation (1), by Mo & White
(1996) formally assumes that the correlation function of the matter satisfies
$\xi_m(r) \ll 1$ and that $r \gg r_L$, where $r_L = (3M/4\pi\rho_0)^{1/3}$ is the
comoving Lagrangian radius of the halos ($\rho_0$ being the present mean
density). The model has been tested against N-body simulations by Mo & White and
by Mo, Jing & White (1996), who find that the formula works quite well in
practice down to $r \approx r_L$, even when $\xi_m(r) > 1$. For the Lyman-break
galaxies in model A, $r_L \approx 1\,h^{-1}$ Mpc (comoving), corresponding to
$\theta \approx 70''$ at $z = 3$. Coincidentally, this is close to the scale
where $\xi_m(r) = 1$. On smaller scales, the halo correlation function should
flatten relative to the matter correlation function. In Fig. 8 we have therefore
plotted the galaxy correlation function as a dashed curve on scales $r < r_L$,
where the assumption of constant bias probably breaks down. The spatial
correlations in the range $0.3 < r < 3\,h^{-1}$ Mpc (comoving) are a good match
to a power law, $\xi(r) = (r_0/r)^\gamma$, with $\gamma = 1.8$ and
$r_0 = 3.9\,h^{-1}$ Mpc. Thus, our models predict that Lyman-break galaxies at
$z = 3$ should have a comoving clustering length comparable to that of bright
galaxies today.
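The scales quoted in this paragraph can be cross-checked with a short sketch for the $\Omega_0 = 1$ case, where the comoving distance has the closed form $x = (2c/H_0)[1 - (1+z)^{-1/2}]$. The halo mass used in the example, $1.16 \times 10^{12}\,h^{-1}M_\odot$, is a hypothetical value chosen to give a Lagrangian radius close to $1\,h^{-1}$ Mpc; it is not a number quoted in the text.

```python
import math

RHO_CRIT = 2.775e11      # critical density in h^2 Msun Mpc^-3
C_OVER_H0 = 2997.92458   # c / H0 in h^-1 Mpc
ARCSEC_IN_RAD = math.pi / (180.0 * 3600.0)


def lagrangian_radius(mass, omega0=1.0):
    """Comoving Lagrangian radius (h^-1 Mpc) of a halo of mass `mass` (h^-1 Msun)."""
    rho0 = omega0 * RHO_CRIT  # present mean density
    return (3.0 * mass / (4.0 * math.pi * rho0)) ** (1.0 / 3.0)


def eds_comoving_distance(z):
    """Comoving distance (h^-1 Mpc) to redshift z for Omega_0 = 1."""
    return 2.0 * C_OVER_H0 * (1.0 - 1.0 / math.sqrt(1.0 + z))


def angle_in_arcsec(r_comoving, z):
    """Angle (arcsec) subtended by a comoving length (h^-1 Mpc) at redshift z."""
    return r_comoving / eds_comoving_distance(z) / ARCSEC_IN_RAD


r_l = lagrangian_radius(1.16e12)  # hypothetical halo mass
theta_l = angle_in_arcsec(1.0, 3.0)
kpc_per_arcsec = 1000.0 * eds_comoving_distance(3.0) * ARCSEC_IN_RAD
print(f"r_L = {r_l:.2f} h^-1 Mpc")
print(f"1 h^-1 Mpc at z = 3 subtends {theta_l:.0f} arcsec")
print(f"1 arcsec at z = 3 spans {kpc_per_arcsec:.1f} h^-1 kpc (comoving)")
```

Both conversions agree with the values of about 70 arcseconds and 14 $h^{-1}$ kpc per arcsecond used in the text and in Fig. 8.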
We see that the models predict appreciable angular correlations for the
Lyman-break galaxies, $w(\theta) \approx 0.1$ at $\theta = 100''$ for model A,
and a slightly lower value for model G. In contrast, Brainerd et al. (1995)
estimated $w(\theta) \approx 0.005$ at the same angular scale for field galaxies
with $\mathcal{R}_{AB} \lesssim 25$. The larger $w(\theta)$ that we predict is
the result of several effects: (i) the relatively narrow redshift range for the
Lyman-break objects, which reduces projection effects in $w(\theta)$; (ii) the
large degree of bias, $b \simeq 4$, which results from the galaxies occurring in
rare dark halos; and (iii) possible differences in the R magnitude scale between
the S96 and Brainerd et al. datasets.
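The effect of a narrow redshift range on $w(\theta)$ can be illustrated with a rough estimate. The sketch below is not the calculation used in the paper: it applies the small-angle Limber approximation to a pure power law $\xi(r) = (r_0/r)^\gamma$, extrapolated to all separations, for galaxies spread uniformly in comoving distance through a shell. The shell limits $2.8 < z < 3.4$ are a hypothetical stand-in for the true redshift distribution, and an $\Omega_0 = 1$ geometry is assumed.

```python
import math

C_OVER_H0 = 2997.92458  # c / H0 in h^-1 Mpc


def eds_comoving_distance(z):
    """Comoving distance (h^-1 Mpc) to redshift z for Omega_0 = 1."""
    return 2.0 * C_OVER_H0 * (1.0 - 1.0 / math.sqrt(1.0 + z))


def limber_power_law(theta_arcsec, r0, gamma, z_lo, z_hi):
    """w(theta) for xi = (r0 / r)**gamma and a uniform shell between z_lo and z_hi.

    Uses w = A * r0**gamma * (x * theta)**(1 - gamma) / depth, with
    A = sqrt(pi) * Gamma((gamma - 1) / 2) / Gamma(gamma / 2), where x is the
    distance to the shell centre and depth is its comoving thickness.
    Valid only when x * theta is much smaller than depth.
    """
    theta = theta_arcsec * math.pi / (180.0 * 3600.0)
    x_lo = eds_comoving_distance(z_lo)
    x_hi = eds_comoving_distance(z_hi)
    x_mid = 0.5 * (x_lo + x_hi)
    depth = x_hi - x_lo
    amplitude = (math.sqrt(math.pi) * math.gamma(0.5 * (gamma - 1.0))
                 / math.gamma(0.5 * gamma))
    return amplitude * r0 ** gamma * (x_mid * theta) ** (1.0 - gamma) / depth


w_100 = limber_power_law(100.0, 3.9, 1.8, 2.8, 3.4)
w_200 = limber_power_law(200.0, 3.9, 1.8, 2.8, 3.4)
print(f"w(100 arcsec) = {w_100:.2f}")
```

With these assumed inputs the estimate is of order 0.1 at 100 arcseconds, the same order as the value read from Fig. 8b, and it scales inversely with the shell depth, which is the sense in which a narrow redshift range reduces projection effects.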

4. THE FATE OF HIGH-Z GALAXIES 

In Section 3 we showed that a population of star forming galaxies with the
abundance and global properties of the Lyman-break galaxies discovered by S96
arises naturally in hierarchical clustering theories. Models with a range of
values of the cosmological parameters are equally successful in accounting for
this population of high redshift galaxies. In this and the following section we
consider the role that this population plays in the general scheme of galaxy
formation. First we ask the question: what do the Lyman-break galaxies evolve
into? Are they, as S96 conjectured, the progenitors of the spheroidal components
of present day galaxies? Our semi-analytic modelling technique provides an ideal
tool to answer this sort of question since the entire evolutionary history of a
model galaxy is readily available.

We begin by displaying graphically the evolutionary paths of a few examples. From
the present day population in the model, we have chosen four examples of
different morphological types with at least one progenitor at $z \simeq 3$ that
satisfied the S96 colour and magnitude selection criteria. The tree plots in
Fig. 9 illustrate the star formation histories of these examples. The stellar
mass of the final, present day, galaxy is given at the top of each panel, along
with the B-band luminosity, bulge-to-total luminosity ratio in the B band and the
B-V colour. Redshift decreases down the trees, and the bottom of each plot
represents the present day. Each branch in the tree represents a progenitor
fragment and its width at any epoch is proportional to the mass of stars in the
progenitor at that epoch. Branches merge together when the fragments they
represent merge. The plots have been normalised to have unit width at $z = 0$.
These tree plots are similar to those used by Baugh, Cole & Frenk (1996b) to
illustrate the formation paths of galaxies of different morphological types (see
their figures 2 and 3). Galaxies that undergo major mergers at recent epochs are
identified with ellipticals and S0s whereas galaxies that have grown quiescently
by protracted accretion of cooling gas (perhaps around a bulge formed by a prior
merger) are identified with spirals. Minor mergers that do not disrupt a stellar
disk add stars to the bulge component (see Baugh et al. 1996b for precise
definitions).

At the present day, the galaxy in the top left hand corner of Fig. 9 is a spiral,
the galaxy at the bottom left is on the border between being an elliptical or S0
galaxy, the galaxy at the top right is a field elliptical and the galaxy at the
bottom right is a cluster elliptical. Lyman-break galaxies (marked by the star in
each panel) can therefore end up in galaxies of any morphological type and, as we
shall see in Fig. 11, they can span a wide range of luminosity.

The distribution of bulge-to-total stellar mass amongst the descendants of
Lyman-break galaxies is similar to that of bright galaxies (brighter than
$M_B - 5\log h = -19$) without such a progenitor. The distribution of halo
circular velocity for the Lyman-break descendants, on the other hand, is biased
towards large values typical of groups and clusters (Fig. 10). This is just what
was expected in view of the strong clustering bias exhibited by the Lyman-break
galaxies themselves (see §3.3.4).

The luminosity function of the present day descendants of Lyman-break galaxies is
plotted in Fig. 11, where it is compared with the luminosity function of the
galaxy population as a whole. In this figure we show results for both the
standard CDM model A and the low-$\Omega_0$ model G. In both cases, the bright
end of the current luminosity function is made up of galaxies which, at
$z \simeq 3$, had at least one progenitor that satisfied the luminosity and
colour criteria required to qualify as a Lyman-break galaxy in the study of
Steidel et al. (1995). Virtually all present day galaxies with
$L \gtrsim 2.5 L_*$ have such a progenitor. The fraction of Lyman-break
descendants decreases with decreasing luminosity so that virtually no present day
galaxy with $L \lesssim L_*/5$ was ever a Lyman-break galaxy of the type observed
by Steidel et al.

The assembly of the Lyman-break galaxies themselves is illustrated in Fig. 12,
where we plot the growth of the stellar mass of selected Lyman-break galaxies
with time. The stellar mass at each redshift (which may be spread amongst several
fragments) is plotted as a fraction of the "final" stellar mass of the
Lyman-break galaxy at $z = 3$ in Fig. 12(a). Star formation in the Lyman-break
galaxies begins very early, at $z > 6$, but only $\sim 20 - 40$% of the stars
have formed by $z = 4$. The bulk of the stars present in these Lyman-break
galaxies at $z \simeq 3$ was formed in the preceding few hundred million years.
The total star formation rates summed over the fragments, in units of the star
formation rate at $z = 3$, are plotted in Fig. 12(b).

FIG. 9 Tree diagrams illustrating the star formation histories of four present
day galaxies which had a high redshift progenitor satisfying the luminosity and
colour selection criteria of Steidel et al. (1995). The actual progenitor is
marked by a star. These examples are taken from the standard CDM model A. The
present day is at the base of each tree; the trees extend back to a redshift
of 5. The width of each branch at any epoch is proportional to the mass in stars
in the branch at that epoch. The trees have been normalised to have unit width at
$z = 0$. The labels give the stellar mass of the final galaxy at the present day,
along with its B band luminosity, the bulge to total light ratio in the B band
and the B-V colour.

FIG. 10 The distribution of present-day halo circular velocity of galaxies in
model A that contained at least one Lyman-break galaxy at high redshift (solid
line), compared with the distribution of halo velocities for present day galaxies
without such a progenitor (dashed line). All the present day galaxies considered
are brighter than $M_B - 5\log h = -19$.

FIG. 11 Present-day B-band luminosity functions in model A (solid lines) and
model G (dotted lines). In each case, the extended curve shows the luminosity
function of the galaxy population as a whole. The shorter curve shows the
luminosity function of galaxies that contained at least one progenitor satisfying
the selection criteria for Lyman-break galaxies at high redshift in the study of
Steidel et al. (1995). The data points show observational determinations of the
luminosity function from Loveday et al. (1992) and Marzke et al. (1994).





FIG. 12 The star formation histories of five example galax- 
ies that satisfy the Steidel et al. colour selection in model A. 
The galaxies all have dark halos of mass 2 x 10^^ Mq at 
$z = 3$. Panel (a) shows the build up in stellar mass expressed as a fraction of
the mass in stars at $z = 3$. Panel (b) shows the instantaneous star formation
rate in units of the star formation rate at $z = 3$. Curves of the same line
style refer to the same galaxies in (a) and (b).

FIG. 13 Evolution of the comoving number density of galaxies that have stellar
masses in excess of each of two thresholds (in units of $h^{-1}M_\odot$). The
solid line shows results for model A and the dotted line for model G.



The biiild-up of the population of galaxies with masses 
typical of Lyman-break galaxies is illustrated in Fig. 13. 
Here we plot the evolution of the comoving number den- 
sity of galaxies that have stellar masses in excess of 10^ 
FIG. 15 The distribution of star formation rates (number of star forming systems per $\log_{10}$(SFR) per comoving volume) at different 
redshifts, computed from our models and compared to observational data. Panel (a) shows model A, panel (b) model C and panel 
(c) model G. The solid, dashed and dotted lines show the model predictions for z = 0, 2.75 and 4 respectively. The symbols with 
error bars are observational data points. The crosses are for z = 0 (Gallego et al. 1995), the filled circles for $\langle z \rangle$ = 2.75 (Madau 
1996) and the open circles for $\langle z \rangle$ = 4 (Madau 1996). The observational data have been converted into total SFRs assuming either 
a Miller-Scalo or Salpeter IMF, as indicated on each panel and described in the text. The top scale shows the luminosity L(1500) 
expressed as an AB magnitude: $M_{AB}(1500) = -2.5\log_{10}[L(1500)/{\rm erg\,s^{-1}\,Hz^{-1}}] + 51.6$. 
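
The magnitude scale quoted in the caption of Fig. 15 can be illustrated with a minimal sketch. The function names below are ours and purely illustrative. The constant 51.6 is taken from the caption; the cross-check assumes the standard AB zero point of -48.60 and a reference distance of 10 pc, which the caption does not spell out.

```python
import math

PARSEC_CM = 3.0857e18  # one parsec in centimetres


def ab_magnitude_1500(l_nu):
    """Absolute AB magnitude for a luminosity L(1500) in erg s^-1 Hz^-1,
    using the relation quoted in the caption of Fig. 15."""
    if l_nu <= 0.0:
        raise ValueError("luminosity must be positive")
    return -2.5 * math.log10(l_nu) + 51.6


def luminosity_from_ab(m_ab):
    """Inverse relation: L(1500) in erg s^-1 Hz^-1 from an absolute AB magnitude."""
    return 10.0 ** ((51.6 - m_ab) / 2.5)


def ab_zero_point():
    """Cross-check of the constant (assumption: AB zero point -48.60 for a
    flux density in erg s^-1 cm^-2 Hz^-1, source placed at 10 pc)."""
    area = 4.0 * math.pi * (10.0 * PARSEC_CM) ** 2
    return 2.5 * math.log10(area) - 48.60


if __name__ == "__main__":
    print(ab_magnitude_1500(1.0e28))
    print(ab_zero_point())
```

Under these assumptions the zero point comes out within 0.01 mag of the 51.6 used in the caption.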




FIG. 16 The total star formation rate per comoving volume as a function of redshift, computed from our models (solid curves) 
and compared to observational estimates (symbols). Panel (a) is for model A, panel (b) for model C and panel (c) for model G. 
The filled triangle is based on the Hα luminosity function of Gallego et al. (1995); the filled diamond is from the 2000 Å luminosity 
function of Treyer et al. (1997); the filled circles are from the rest frame 2800 Å luminosity densities of Lilly et al. (1996); the open 
stars come from the 2800 Å fluxes of Connolly et al. (1997); the inverted triangles are from the 3000 Å fluxes of Sawicki et al. (1997); 
and the filled squares come from the 1500 Å luminosity function of Madau (1996). The right hand scale shows the luminosity density 
at 1500 Å. The data points have been converted to total star formation rates assuming either a Miller-Scalo or Salpeter IMF as 
indicated on each panel. The open squares show the effects of a factor of 3 correction in Madau's (1996), z > 2, star formation 
rates per unit comoving volume due to dust obscuration, as suggested by Pettini et al. (1997). 
and $M_\odot$. The number of galaxies above a given 
mass limit increases as star formation proceeds, but it decreases 
when mergers involving galaxies of this size occur. 
At z ~ 3, the abundance of galaxies with mass of a few 
times lO^h^^ Mq is rapidly rising. This is also close to 
the time when galaxies of = 10^'^ h^^ Mq first appear 
in significant numbers. Thus, in the cosmological models 
discussed in this paper, z ~ 3 is the first epoch at which 
galaxies form with stellar masses comparable to present-day 
$L_*$ galaxies. 




5. THE COSMIC STAR FORMATION HISTORY 

White & Frenk (1991), Lacey et al. (1993), Cole et al. 
(1994) and Heyl et al. (1995) demonstrated that hierarchical 
models of galaxy formation that are consistent with 
local data tend to form the bulk of their stars at relatively 
low redshifts (see also Baron & White 1987). This is a 
feature not only of the $\Omega_0 = 1$ CDM cosmology, but also 
of successful low-density CDM models. Fig. 14 shows the 
star formation histories predicted in our $\Omega_0 = 1$ and low-$\Omega$ 
models (models A and G, indicated by solid and dotted 
lines respectively). In both cosmologies, 50% of the stars 
form after z ~ 1. The stars that have formed by z ~ 3 
account for less than 10% of the present day total; very 
little star formation occurs before z = 4. Note that in 
spite of the improvements to our galaxy formation model, 
the curve for model A in Fig. 14 is virtually identical to 
the curve for the fiducial model in Figure 21 of Cole et al. 
(1994) while the curve for model G agrees well with the 
results tabulated in Table 3 of Heyl et al. (1995). 

Observational data that can be compared with theoret- 
ical predictions for the cosmic star formation history are 
now becoming available (e.g. Lilly et al. 1996, Madau et 
al. 1996). In Fig. 15 we present a compilation of current 
data expressed as the comoving number density of galax- 
ies as a function of star formation rate (SFR) at different 
redshifts. Star formation rates are not, of course, directly 
observed but inferred from the flux in a restframe UV pass- 
band, a cosmological model to convert flux to luminosity, 
a model for the spectral energy distribution, and an assumption 
about the initial stellar mass function (IMF). To 
compare the different datasets with one another and 
with our model predictions, we have derived SFRs from 
published data in a homogeneous manner. We present results 
for both the Miller-Scalo and Salpeter IMFs. The 
SFRs in our models are total and include the contribution 
from brown dwarfs (i.e. stars with mass below the hydrogen-burning 
limit) which is parametrized by the factor T > 1 
(see Table 1). 



FIG. 14 Predicted star formation histories in model A (solid 
line) and model G (dotted line). The curves give the fraction 
of the final mass in stars that has formed by a given redshift. 



The z = 0 data plotted in Fig. 15 were derived from the 
Hα luminosity function of Gallego et al. (1995). The high-redshift 
data in the figure were derived from the data presented 
by Madau (1996) for galaxies in the Hubble Deep 
Field identified photometrically as Lyman-break galaxies. 
The $U_{300}$-band dropouts are estimated to have 2 < z < 3.5 
and $\langle z \rangle$ = 2.75, while the $B_{450}$-band dropouts are estimated 
to have 3.5 < z < 4.5 and $\langle z \rangle$ = 4. Madau gives 
SFRs derived from broad-band magnitudes at wavelengths 
close to 1500 Å in the rest frame. We first convert these 
back to the corresponding values of L(1500) = $L_\nu$(1500 Å) 
using Madau's own conversion factor, as given in Madau 
et al. (1996). We then convert the 1500 Å and Hα luminosities 
into total SFRs using the values tabulated in 
Table 5 and the appropriate value of T for each model. 
In addition, for models with $\Omega_0 \neq 1$ an approximate scaling 
has been applied to Madau's data points to account 
for the differences in the comoving volume element and 
luminosity distance in the different cosmologies. 
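
The two-step conversion described above (published SFR back to luminosity, then luminosity to total SFR) can be sketched as follows. This is only an illustration under stated assumptions: the function names are ours, and the numerical constants in the usage lines are round placeholder values, not the entries of Table 5 or Madau's published factor. The sketch reads Table 5 as giving the luminosity produced when the luminous SFR is 1 solar mass per year, so that the total SFR is T times larger.

```python
def luminosity_from_published_sfr(published_sfr, published_conversion):
    """Undo a published SFR estimate: recover the luminosity it was based on.

    published_conversion is the luminosity per unit SFR adopted by the
    original authors (a placeholder value is used in the example below).
    """
    if published_conversion <= 0.0:
        raise ValueError("conversion factor must be positive")
    return published_sfr * published_conversion


def total_sfr(luminosity, luminosity_per_unit_luminous_sfr, brown_dwarf_factor):
    """Total SFR (solar masses per year) implied by a luminosity.

    luminosity_per_unit_luminous_sfr: luminosity produced when stars above
        the hydrogen-burning limit form at 1 solar mass per year.
    brown_dwarf_factor: the factor T >= 1 of the text (total / luminous).
    """
    if luminosity_per_unit_luminous_sfr <= 0.0:
        raise ValueError("conversion factor must be positive")
    if brown_dwarf_factor < 1.0:
        raise ValueError("the brown dwarf factor T must be >= 1")
    luminous_sfr = luminosity / luminosity_per_unit_luminous_sfr
    return brown_dwarf_factor * luminous_sfr


if __name__ == "__main__":
    # Placeholder numbers, chosen only to make the arithmetic easy to follow.
    l_1500 = luminosity_from_published_sfr(10.0, 8.0e27)  # erg s^-1 Hz^-1
    print(total_sfr(l_1500, 1.0e28, 1.5))
```

Keeping the luminosity as the intermediate quantity means that a change of IMF only changes the second step, which is why the data points, not the model curves, move between panels.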

The curves plotted in Fig. 15 display our model predictions. 
The upper panel shows results for our standard 
$\Omega_0 = 1$ model A and the lower panel for our flat low-$\Omega$ 
model G, both of which have the Miller-Scalo IMF. The 
middle panel shows model C, which has $\Omega_0 = 1$ and the 
Salpeter rather than the Miller-Scalo IMF. The theoretical 
curves are exactly the same for this case as for model A, 
the only difference being the scalings applied to the observational 
data points. In all cases the behaviour of the 
models is qualitatively the same as that of the observed data. The 
SFRs are larger at the intermediate redshift z = 2.75 and 
at z = 4 they drop back to values similar to those at 
z = 0. The model that best reproduces the observed data 
is model A. There are, however, considerable uncertainties 
in this comparison. For example, the mean redshift of 
Table 5 

The L(1500), L(2800) and Hα luminosities produced by a constant total SFR = T $M_\odot\,{\rm yr}^{-1}$ after 1 Gyr. L(Hα) is in units 
of erg s$^{-1}$; L(1500) and L(2800) are in erg s$^{-1}$ Hz$^{-1}$. 

band       Miller-Scalo             Scalo                    Salpeter 
L(Hα)      $1.38 \times 10^{41}$    $6.21 \times 10^{40}$    $1.98 \times 10^{41}$ 
L(1500)    $1.34 \times 10^{28}$    $3.88 \times 10^{27}$    $8.68 \times 10^{27}$ 
L(2800)    $1.12 \times 10^{28}$    $3.59 \times 10^{27}$    $6.76 \times 10^{27}$ 


the $U_{300}$ dropouts in our mock HST catalogue is z ~ 2.3, 
smaller than the central value, $\langle z \rangle$ = 2.75, assumed for 
the real data. This approximation alone would lead to an 
overestimate of the inferred star formation rate by up to 
15%. For the $B_{450}$ dropouts this effect is smaller, around 
5%. More importantly, these comparisons are sensitive to 
the details of the adopted IMF, as may be seen by the way 
in which the data points shift between panels (a) and (b). 

The cosmic star formation history may be conveniently 
summarised by considering the variation with redshift of 
the total SFR per comoving volume, as in the theoretical 
predictions of Fig. 14. This quantity is obtained by inte- 
grating the differential distributions in Fig. 15 over galax- 
ies of all SFRs. For the theoretical models, this is straight- 
forward, but for the observations, it involves extrapolating 
the distribution of SFRs to ranges not directly observed. 
Comparing theoretical and "observed" total SFRs is thus 
considerably more uncertain than comparing differential 
distributions. With this caveat, we compare in Fig. 16 
our theoretical predictions with several observational esti- 
mates including those based on the data of Gallego et al. 
(1995) and Madau (1996) already described, and also the 
B-band data of Lilly et al. (1996) for 0.2 < z < 1. In all 
cases, we have used the observers' estimates of the total 
luminosity density, which are based on fitting a Schechter 
function to the luminosity function and extrapolating it to 
all luminosities. The Lilly et al. data refer to the luminosity 
density at 2800 Å. This is less directly related to the 
instantaneous star formation rate than the Hα or 1500 Å 
luminosities, because it is dominated by somewhat older 
stars. The 2800 Å rest-frame luminosity is sampled by the 
observed B-band flux only at z ≈ 0.5-1.0, so the estimate 
at z ~ 0.35 requires a modest extrapolation from longer 
wavelengths. This introduces an additional uncertainty. 
The constants used to convert L(2800) to total SFR are 
also listed in Table 5. The upper limit plotted at z = 5.5 is 
based on the number of $V_{606}$ dropouts in the HDF, which 
are candidates to be Lyman-break galaxies at 5 < z < 6 
(Madau, private communication). 
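
The extrapolation step mentioned above, in which a Schechter function fitted to the observed luminosity function is integrated over all luminosities, can be sketched as follows. For a Schechter function with normalisation phi*, characteristic luminosity L* and faint-end slope alpha, the total luminosity density is phi* L* Γ(alpha + 2), which converges only for alpha > -2. The function names and the parameter values in the usage lines are illustrative assumptions, not fits taken from any of the surveys cited.

```python
import math


def schechter_luminosity_density(phi_star, l_star, alpha):
    """Analytic luminosity density, integrating L * phi(L) over all L."""
    if alpha <= -2.0:
        raise ValueError("the integral diverges for alpha <= -2")
    return phi_star * l_star * math.gamma(alpha + 2.0)


def schechter_luminosity_density_numeric(phi_star, l_star, alpha,
                                         x_min=math.exp(-40.0),
                                         x_max=math.exp(5.0), steps=20000):
    """Trapezoidal integration in u = ln(L / L*) between x_min and x_max.

    With x = L / L*, the integrand L * phi(L) dL becomes
    phi* L* x**(alpha + 2) * exp(-x) du.
    """
    u_lo, u_hi = math.log(x_min), math.log(x_max)
    h = (u_hi - u_lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = math.exp(u_lo + i * h)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x ** (alpha + 2.0) * math.exp(-x)
    return phi_star * l_star * total * h


if __name__ == "__main__":
    phi_star, l_star, alpha = 1.0e-2, 1.0, -1.3  # illustrative values only
    full = schechter_luminosity_density(phi_star, l_star, alpha)
    # Fraction of the total contributed by galaxies brighter than 0.1 L*.
    seen = schechter_luminosity_density_numeric(phi_star, l_star, alpha, x_min=0.1)
    print(full, seen / full)
```

Setting `x_min` to the survey limit shows how much of the quoted total rests on the assumed faint-end slope rather than on observed galaxies.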

After the original version of this paper was submitted, 
many more data points were added to the star formation 
history diagram and we reproduce here a selection 
of them. At low redshift, Treyer et al. (1997) have estimated 
the star formation density from 2000 Å (rest-frame) 
fluxes; Sawicki et al. (1997) have used photometric redshifts 
of galaxies in the HDF to infer the star formation 
density from 3000 Å fluxes; Connolly et al. (1997) have 
used both optical and ground based near-infrared imaging 
of the HDF to infer star formation rates from 2800 Å fluxes. 
(The use of infrared data is particularly important for the 
accuracy of photometric redshifts at z ~ 1-2.) Apart from 
the Sawicki et al. points at z > 2, the level of agreement 
amongst these different determinations is remarkable, although 
it is suggestive that both the Sawicki et al. and 
Connolly et al. points at z < 1 lie above those from the 
CFRS survey, in better agreement with our model predictions. 
(The CFRS survey, however, has more galaxies and 
therefore smaller error bars at these redshifts.) 

Overall, the agreement between theoretical predictions 
and data in Fig. 16 is impressive. It must be borne in 
mind that these are genuine theoretical predictions that 
predate the observational data. The theoretical curve in 
the upper panel of Fig. 16 is simply the time derivative of 
the integrated curve for the fiducial CDM model plotted 
in figure 21 of Cole et al. (1994). Note, however, that 
the location of the observational data points depends on 
the assumed IMF; Cole et al. used a Scalo IMF whereas in 
§3.2, we found a Miller-Scalo or Salpeter IMF to be preferable. 
For a given luminous star formation rate, SFR/T, 
a Miller-Scalo IMF gives 2.2 times the Hα flux, 3.5 times 
the 1500 Å flux and 3.1 times the 2800 Å flux compared to 
a Scalo IMF. Comparing panels (a) and (b) of Fig. 16, we 
see that the main effect of changing the IMF from Miller-Scalo 
to Salpeter is to move the z = 0 data point based on 
the Hα luminosity. Both models A and G show the same 
qualitative trend as the data, with a broad peak in the 
total SFR at $1 \lesssim z \lesssim 2$. For $z \gtrsim 2$, the model SFRs fall off 
somewhat more slowly than the data. The completeness 
of the observational data at these high redshifts, however, 
is difficult to establish. 
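
The statement that the curve in the upper panel of Fig. 16 is the time derivative of the cumulative star formation curve can be illustrated with a short numerical sketch. It assumes an $\Omega_0 = 1$ universe, for which the age at redshift z is (2/3)(1+z)^(-3/2) in units of the Hubble time, so it applies to model A only. The function names and the tabulated values in the usage lines are hypothetical and are not read off Fig. 14.

```python
def einstein_de_sitter_age(z, hubble_time=1.0):
    """Age of an Omega_0 = 1 universe at redshift z, in units of hubble_time."""
    return (2.0 / 3.0) * hubble_time * (1.0 + z) ** -1.5


def rate_from_cumulative(redshifts, formed_fraction, hubble_time=1.0):
    """Finite-difference time derivative of a cumulative formation curve.

    redshifts, formed_fraction: matching sequences giving the fraction of
        the final stellar mass in place at each redshift.
    Returns a list of (mid-point redshift, dF/dt) pairs, ordered from
    early times (high redshift) to late times.
    """
    pairs = sorted(zip(redshifts, formed_fraction), reverse=True)
    rates = []
    for (z_early, f_early), (z_late, f_late) in zip(pairs, pairs[1:]):
        dt = (einstein_de_sitter_age(z_late, hubble_time)
              - einstein_de_sitter_age(z_early, hubble_time))
        rates.append((0.5 * (z_early + z_late), (f_late - f_early) / dt))
    return rates


if __name__ == "__main__":
    # Hypothetical cumulative curve, for illustration only.
    zs = [0.0, 0.5, 1.0, 2.0, 3.0, 4.0]
    fs = [1.00, 0.75, 0.50, 0.22, 0.08, 0.03]
    for z_mid, rate in rate_from_cumulative(zs, fs):
        print(z_mid, rate)
```

Multiplying each rate by the final comoving stellar mass density would give the SFR per comoving volume; the sketch leaves that normalisation out because it depends on quantities not quoted in this section.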

A further source of uncertainty is the possible pres- 
ence of dust in the star-forming galaxies. Even a modest 
amount of dust would cause attenuation of the ultraviolet 
flux, leading to a potentially severe underestimate of the 
star formation rate. Tentative detections of the cosmic 
infrared background by Puget et al. (1996) and Guider- 
doni et al. (1997a) and upper limits on it (Kashlinsky, 
Mather & Odenwald 1996) provide only weak constraints 
on the amount of dust present in high redshift galaxies 
(Madau, Pozzetti & Dickinson, 1997; Guiderdoni et al. 
1997b). Monolithic collapse models (e.g. Eggen, Lynden- 
Bell & Sandage 1962) in which a significant fraction of the 
total star formation in the universe takes place at high 
redshifts enshrouded in dust (e.g. Meurer et al. 1997), 
appear to overpredict the total mass of heavy elements in 
place at early times (Madau et al. 1997), as inferred from 
observations of damped Lyman-α systems (Pettini, Smith, 
King & Hunstead 1997). Estimates of the factor by which 
star formation rates deduced from UV flux should be re- 
vised to account for the presence of dust span a range 
of values. The results are sensitive both to the form of 
the extinction law adopted and to the assumed age of the 
primeval galaxy which determines how intrinsically blue 
it is. Primeval galaxies in our models are not, in general, 
ultraluminous starbursts since they never experience ex- 
ceptional star formation rates. In this case, Dickinson et 
al. (in preparation) and Pettini et al. (1997) argue that 
the likely correction is around a factor of 1.8-3 at z ~ 3 
for star formation rates inferred from 1500 Å fluxes. These 
corrections are a factor of ~ 5 smaller than those advocated 
by Meurer et al. (1997). In Fig. 16, we illustrate 
the effect of the Pettini et al. correction at high redshift 
by multiplying the points of Madau et al. (1996) by a 
factor of 3 (open squares). The correction appropriate to 
the lower redshift points is also uncertain and we do not 
attempt to illustrate it in Fig. 16. It is likely to be smaller 
than at high redshift since the star formation rates are 
derived from longer wavelength data. 

A related observational constraint on the evolution of 
the galaxy population comes from observations of neutral 
hydrogen at high redshift using quasar absorption lines. 
Fig. 17 compares the evolution of the cold gas fraction 
in our models with estimates by Storrie-Lombardi et al. 
(1996), derived from the statistics of damped Lyman-alpha 
absorption lines. Whereas Kauffmann's (1996b) semi- 
analytic models agree quite well with these data, our own 
models agree only in the qualitative sense that the comov- 
ing cold gas density has a broad peak at a redshift z = 2- 
3. Our models predict consistently more cold gas than is 
inferred from the observations. On the other hand, the 
observational results may underestimate the total cold gas 
density in galaxies because (i) they only include atomic 
hydrogen at column densities $N_{\rm H} > 2 \times 10^{20}\,{\rm cm}^{-2}$, and 
do not include ionized or molecular gas at all; and (ii) 
dust obscuration may cause some absorption systems to be 
missed. Regarding (i), all the gas in our model galaxies at 
$T \lesssim 10^4$ K is counted as "cold"; the correction for ionized 
gas and for low column-density H I ($N_{\rm H} < 2 \times 10^{20}\,{\rm cm}^{-2}$) 
is probably not large, but the correction for molecular hydrogen 
might be significant. Regarding (ii), the chemical 
evolution models of Pei & Fall (1995) suggest that because 
of dust obscuration, the true neutral hydrogen density is 
2-3 times higher than the "directly measured" value, mov- 
ing the observational points in Fig. 17 much closer to the 
theoretical curves. The dotted-line error bars in Fig. 
17(a) show plausible corrections for these effects, using the 
output from one of the models of Pei & Fall (1995), fol- 
lowing Figure 2 of Storrie-Lombardi et al. (1996). 
FIG. 17 The total comoving density of cold gas, $\rho_{cg}$, in 
units of the present critical density. The curve in the upper 
panel shows the evolution of $\rho_{cg}$ with redshift in model A. The 
dependence of $\rho_{cg}$ on redshift in models B and C, which differ 
from A only in the choice of IMF, is identical to that in A. The 
curve in the lower panel shows results for model G. The data 
points are observational estimates, based on damped Lyman-alpha 
absorption lines from Storrie-Lombardi et al. (1996). We 
have applied an approximate scaling to their $\Omega_0 = 1$ estimates 
to derive the corresponding values for the flat low-$\Omega$ model G. 
The data point at z = 0 is based on the H I luminosity function 
of nearby galaxies derived from 21 cm observations. The 
dotted-line errorbars in the upper panel show the corrections 
to the data suggested by Storrie-Lombardi et al. to account 
for the effects of incompleteness and of dust obscuration, using 
a model from Pei & Fall (1995). We have retained the same 
fractional errors on the 'corrected' data points. 
FIG. 18 The ratio of the number density $n_\gamma$ of ionizing photons 
produced per comoving volume to the number density $n_{\rm H}$ 
of hydrogen atoms. At the redshift at which $n_\gamma/n_{\rm H} = 1$, just 
enough ionizing photons have been produced to ionize every 
hydrogen atom exactly once. The solid line is for model A, the 
dotted for model G and the dashed for model C. 



An interesting issue is whether massive stars in young 
galaxies can, on their own, produce enough ionizing photons 
(λ < 912 Å) to re-ionize the IGM by z = 5. (Additional 
ionizing photons are produced by quasars, and 
possibly by pregalactic stars formed at high redshift via 
molecular hydrogen cooling, e.g. Tegmark et al. 1997.) 
A very simple criterion for establishing if re-ionization is 
possible by redshift z is based on the ratio $n_\gamma/n_{\rm H}$, where 
$n_\gamma$ is the total number of ionizing photons produced per 
comoving volume up to redshift z, and $n_{\rm H}$ is the total comoving 
density of hydrogen atoms. When $n_\gamma/n_{\rm H} = 1$, just 
enough ionizing photons are produced to ionize each hydrogen 
atom exactly once. This criterion neglects absorption 
of photons within the emitting galaxies, recombination 
of hydrogen in the IGM, and depletion of the IGM by 
collapse of gas into dark halos. In particular, the fraction 
$f_{\rm esc}$ of ionizing photons that escape from a galaxy may 
be quite low: Dove & Shull (1994) estimate $f_{\rm esc} \sim 10\%$ 
for our own galaxy, while Leitherer et al. (1995) estimate 
$\langle f_{\rm esc} \rangle \lesssim 3\%$ from far-UV observations of nearby starburst 
galaxies. The photon number density, $n_\gamma$, is also sensitive 
to the form of the IMF above 10 $M_\odot$. Fig. 18 shows the 
dependence of $n_\gamma/n_{\rm H}$ on redshift for models A, C and G. 
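
The photon-counting criterion can be written down in a few lines. The sketch below is illustrative only: the function names are ours, the baryon density `omega_b_h2` and the hydrogen mass fraction are assumed inputs that are not quoted in this section, and the escape fraction is applied as a simple multiplicative factor, which goes one step beyond the criterion in the text (where absorption inside galaxies is neglected). Recombinations are still ignored.

```python
HYDROGEN_MASS_G = 1.6735e-24          # mass of a hydrogen atom in grams
CRITICAL_DENSITY_H2_G_CM3 = 1.878e-29  # critical density for h = 1, in g cm^-3


def comoving_hydrogen_density(omega_b_h2, hydrogen_mass_fraction=0.76):
    """Comoving number density of hydrogen atoms in cm^-3.

    omega_b_h2: baryon density parameter times h^2 (assumed input).
    hydrogen_mass_fraction: mass fraction of baryons in hydrogen (assumed).
    """
    if omega_b_h2 <= 0.0:
        raise ValueError("omega_b_h2 must be positive")
    baryon_density = omega_b_h2 * CRITICAL_DENSITY_H2_G_CM3
    return hydrogen_mass_fraction * baryon_density / HYDROGEN_MASS_G


def photons_per_hydrogen_atom(n_gamma, n_hydrogen, escape_fraction=1.0):
    """Ratio n_gamma / n_H, optionally reduced by an escape fraction.

    n_gamma: cumulative comoving density of ionizing photons (cm^-3).
    A ratio of at least 1 satisfies the simple criterion of the text.
    """
    if not 0.0 <= escape_fraction <= 1.0:
        raise ValueError("escape_fraction must lie between 0 and 1")
    return escape_fraction * n_gamma / n_hydrogen


if __name__ == "__main__":
    n_h = comoving_hydrogen_density(0.0125)  # illustrative baryon density
    n_gamma = 5.0 * n_h                      # hypothetical photon output
    for f_esc in (1.0, 0.10, 0.03):
        print(f_esc, photons_per_hydrogen_atom(n_gamma, n_h, f_esc))
```

With escape fractions as low as the few per cent quoted above, a model needs several tens of photons produced per hydrogen atom before the criterion is met, which is why the escape fraction matters as much as the star formation history itself.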

6. DISCUSSION 

We have used a semi-analytic model of galaxy formation 
to interpret recent data on galaxy formation and evolution 
within the context of hierarchical clustering theories, focussing 
primarily on the properties of the recently discovered 
population of Lyman-break galaxies at z ~ 3. Our 
modelling technique allows us to identify the role that this 
population plays within the general scheme of galaxy for- 
mation, and to relate these observations to other data at 
lower redshifts. Our models are quite general, but they 
inevitably require a number of assumptions and simpli- 
fications most of which, in fact, reflect our poor under- 
standing of the processes of star formation and feedback. 
Within these limitations we attempt to represent the rel- 
evant physics using scaling laws that involve the smallest 
possible number of free parameters. Our modelling strat- 
egy is based on fixing these free parameters by requiring 
that the models should match a small subset of the local 
data, particularly the field B-band galaxy luminosity func- 
tion. Thus specified, the models possess predictive power 
and can be tested against high redshift data. 

The specific cold dark matter models that we have con- 
sidered all share the feature that they reproduce the ob- 
served abundance of present day rich clusters of galaxies. 
This fixes the amplitude of primordial density fluctuations 
which determines the epoch at which structures on any 
mass scale form. Current large-scale structure data allow 
a range of values for the cosmological parameters $\Omega_0$ 
and $\Lambda_0$. By way of illustration, we have explored in detail 
a critical density model and two low-density models, 
one open and the other flat. A second common feature 
of the models we have considered is that they all agree 
at some level with most data on the evolutionary properties 
of galaxies at relatively modest redshifts, z < 1, such 
as the evolution of the luminosity function (Baugh et al. 
1996a), and the counts of galaxies as a function of magni- 
tude, morphological type, and redshift (Cole et al. 1994, 
Baugh et al. 1996b, Frenk et al. 1996). Several inter- 
esting predictions of the models have been corroborated 
by subsequent data, including, for example, the redshift 
distribution of B = 24 counts (Frenk et al. 1997). 

An important prediction of semi-analytic models which 
also predated the acquisition of the relevant data is the 
cosmic star formation history, first discussed by White & 
Frenk (1991) and calculated by Cole et al. (1994) and Heyl 
et al. (1995) for the specific models discussed here (cf. 
Fig. 14). We showed in Section 5 that Madau's (1996) 
recent data agree well with these model predictions, al- 
though uncertainties remain in this comparison because 
of the unknown effects of dust obscuration and possible 
incompleteness in the observational samples at high red- 
shifts. In particular, the possibility that a significant frac- 
tion of the total star formation may have been missed in 
recent surveys if it occurred in dust enshrouded starbursts 
at very high redshift has been the subject of some recent 
debate. Such a population is not predicted in our mod- 
els, although a certain amount of dust obscuration at high 
redshift can be accommodated and may even be required: 
our models, in fact, predict star formation rates which are 
somewhat higher than those inferred from the data uncorrected 
for dust obscuration (cf. Fig. 16). The star 
formation rate in our models peaks around z = 1-2, but 
it never varies by more than an order of magnitude over 
the entire range 0 < z < 6. In fact, the star formation 
rate at z ~ 5 is almost identical to the star formation 
rate at the present day. Nevertheless, half of all the stars 
present today formed only at z < 1. The significance of 
the Lyman-break galaxies in the context of these models 
lies in the fact that at the epoch when these galaxies are 
observed, the process of star formation is just beginning 
in earnest. Thus the Lyman-break galaxies are the first 
massive objects to sustain appreciable star formation and, 
in this sense, they signal the onset of galaxy formation. 

The distribution of star formation rate at different red- 
shifts provides even stronger constraints on models than 
the evolution of the integrated star formation rate. Our 
models match existing data quite well, and further obser- 
vational measurements of this fundamental quantity are 
very important. In both models and observations, the qui- 
escent star formation rates are relatively low even in the 
largest protogalaxies, with the majority of galaxies never 
forming stars at rates exceeding a few solar masses per 
year. The exceptions are galaxies that undergo a 
burst of star formation as a result of experiencing a major 
merger. However, predictions for the strength, duration 
and frequency of such bursts will require more detailed 
modelling than we have attempted so far. 

In our models, the Lyman-break galaxies form from rare 
peaks in the density field at high redshift. As a result, we 
predict that their spatial distribution at z ~ 3 should be 
strongly biased relative to the underlying mass, with a 
typical bias parameter, b ~ 4, and a comoving clustering 
length, $r_0 \sim 4\,h^{-1}$ Mpc. Generically, we expect the 
Lyman-break galaxies to be rotating disks, and a simple 
model for the origin of their angular momentum predicts 
typical half-light radii of $\sim 0.5\,h^{-1}$ kpc. The Lyman-break 
galaxies seen at z ~ 3 evolve into the present day ordinary 
ellipticals and spirals that make up the bright end of the 
luminosity function. Their stars will typically be concen- 
trated in the central regions of their descendants. These 
descendants are to be found preferentially in groups and 
clusters, reflecting their biased origin and strong clustering 
at high redshift. 

The appearance of the first protogalaxies at z ~ 3.5 and 
the late conversion of most of the gas into stars fit in well 
with the observation that the neutral hydrogen content of 
the universe, as determined from measurements of damped 
Lyman-alpha clouds, peaks at around z = 3 and declines 
thereafter (Storrie-Lombardi et al. 1996). The neutral 
hydrogen density in our models also exhibits this overall 
behaviour and agrees reasonably well with the data once 
corrections for incompleteness and a small amount of dust 
obscuration are included. Related semi-analytic models 
by Kauffmann (1996b) agree even better with these data. 
Our models are consistent with Madau's (1996) view that, 
with the discovery of Lyman-break galaxies, the bulk of 
the star formation (and the attendant metal production) 
in our universe has, in effect, been identified. Only a small 
fraction of the star formation activity remains to be de- 
tected during the "dark ages" prior to z = 4. Characteris- 
ing such activity is, of course, of great importance for test- 
ing the general view that galaxies formed by hierarchical 
clustering. According to our models, the small percent- 
age of stars that formed prior to the Lyman-break galaxy 
epoch produce enough radiation to make at least a significant 
contribution to the UV flux required to photoionize the 
intergalactic medium by z ~ 5. 

Although a late epoch of galaxy formation in the standard 
CDM model was predicted long ago (Frenk et al. 
1985, White & Frenk 1991), a surprising result of our analysis 
is that this is also true of the now popular low-density 
variants of this model. Indeed, the star formation histories 
of the $\Omega_0 = 1$ model and the flat $\Omega_0 = 0.3$ model 
are remarkably similar. This is largely coincidental: the 
detailed star formation histories depend not only on the 
shape of the power spectrum, but also on its normalisation 
and on the way in which star formation and feedback are 
implemented in our galaxy formation models. The main 
conclusion of our analysis is that, regardless of the exact 
values of the cosmological parameters, CDM models that 
approximately reproduce the abundance of Lyman-break 
galaxies require massive galaxy formation to begin around 
z ~ 3.5. 

In summary, we have argued in this paper that the main 
ingredients of a consistent picture of galaxy formation may 
now be in place. The key observation that has unlocked 
this paradigm is the discovery of a large population of 
star-forming galaxies at z ~ 3 which signals the onset of 
the epoch of galaxy formation that extends well into the 
present day. At this time, data and theoretical modelling 
paint only a broad brush picture of how galaxy formation 
may have occurred. Fortunately, if this emerging picture 
is correct, the details should also be accessible to current 
observational and modelling capabilities. 